Learning naive Bayes classifiers from positive and unlabelled examples with uncertainty

He, Jiazhen, Zhang, Yang, Li, Xue and Shi, Peng (2012) Learning naive Bayes classifiers from positive and unlabelled examples with uncertainty. International Journal of Systems Science, 43(10): 1805-1825. doi:10.1080/00207721.2011.627475


Author He, Jiazhen
Zhang, Yang
Li, Xue
Shi, Peng
Title Learning naive Bayes classifiers from positive and unlabelled examples with uncertainty
Journal name International Journal of Systems Science
ISSN 0020-7721
1464-5319
Publication date 2012-10-01
Year available 2011
Sub-type Article (original research)
DOI 10.1080/00207721.2011.627475
Volume 43
Issue 10
Start page 1805
End page 1825
Total pages 21
Place of publication Abingdon, Oxfordshire, United Kingdom
Publisher Taylor & Francis
Collection year 2013
Language eng
Abstract Traditional classification algorithms require a large number of labelled examples from all the predefined classes, which are generally difficult and time-consuming to obtain. Furthermore, data uncertainty is prevalent in many real-world applications, such as sensor networks, market analysis and medical diagnosis. In this article, we explore the issue of classification on uncertain data when only positive and unlabelled examples are available. We propose an algorithm to build a naive Bayes classifier from positive and unlabelled examples with uncertainty. However, the algorithm requires the prior probability of the positive class, and it is generally difficult for the user to provide this parameter in practice. Two approaches are proposed to avoid this user-specified parameter. One approach is to use a validation set to search for an appropriate value for this parameter, and the other is to estimate it directly. Our extensive experiments show that both approaches generally achieve satisfactory classification performance on uncertain data. In addition, by exploiting the uncertainty in the dataset, our algorithm can potentially achieve better classification performance than traditional naive Bayes, which ignores uncertainty when handling uncertain data.
Keyword Positive unlabelled learning
Uncertain data
Naive Bayes
Positive naive Bayes
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes Version of record first published: 26 Oct 2011

Document type: Journal Article
Collections: Official 2013 Collection
School of Information Technology and Electrical Engineering Publications
 
Citation counts: Cited 5 times in Thomson Reuters Web of Science
Cited 8 times in Scopus
Created: Sat, 13 Apr 2013, 10:17:34 EST by Dr Xue Li on behalf of School of Information Technology and Electrical Engineering