Ambiguous decision trees for mining concept-drifting data streams

Liu, Jing, Li, Xue and Zhong, Weicai (2009) Ambiguous decision trees for mining concept-drifting data streams. Pattern Recognition Letters, 30(15): 1347-1355. doi:10.1016/j.patrec.2009.07.017

Author Liu, Jing
Li, Xue
Zhong, Weicai
Title Ambiguous decision trees for mining concept-drifting data streams
Journal name Pattern Recognition Letters
ISSN 0167-8655
Publication date 2009-11-01
Sub-type Article (original research)
DOI 10.1016/j.patrec.2009.07.017
Volume 30
Issue 15
Start page 1347
End page 1355
Total pages 9
Editor T.K. Ho
G. Sanniti di Baja
Place of publication Amsterdam, Netherlands
Publisher Elsevier
Collection year 2010
Language eng
Subject 080109 Pattern Recognition and Data Mining
890205 Information Processing Services (incl. Data Entry and Capture)
Abstract In real-world situations, explanations for the same observations may differ depending on perceptions or contexts. They may also change with time, especially when concept drift occurs. This phenomenon gives rise to ambiguities. It is useful if an algorithm can learn to reflect ambiguities and select the best decision according to context or situation. Based on this viewpoint, we study the problem of deriving ambiguous decision trees from data streams to cope with concept drift. CVFDT (Concept-adapting Very Fast Decision Tree) is one of the best-known streaming data mining methods that can learn decision trees incrementally. In this paper, we establish a method called ambiguous CVFDT (aCVFDT), which integrates ambiguities into CVFDT by exploring multiple options at each node whenever a node is to be split. When aCVFDT is used to make class predictions, it is guaranteed that the best and newest knowledge is used. When old concepts recur, aCVFDT can immediately relearn them by using the corresponding options recorded at each node. Furthermore, CVFDT does not automatically detect occurrences of concept drift and only scans trees periodically, whereas aCVFDT uses an automatic concept-drift detection mechanism. In our experiments, the hyperplane problem and two benchmark problems from the UCI KDD Archive, namely Network Intrusion and Forest CoverType, are used to validate the performance of aCVFDT. The experimental results show that aCVFDT obtains significantly improved results over traditional CVFDT.
Keyword Decision trees
Concept drifting
Incremental learning
Data streams
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
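
Illustrative sketch (not part of the original record, and not the authors' aCVFDT implementation): the abstract describes keeping multiple options at a node and predicting with whichever reflects the best and newest knowledge. A minimal way to picture that idea, assuming a single node, categorical attributes, and a sliding window of recent examples, is shown below. The names Option, AmbiguousNode, feed and demo, and the window size, are hypothetical choices made for this illustration only.

```python
from collections import Counter, deque


class Option:
    """One candidate test attribute kept at a node (hypothetical structure)."""

    def __init__(self, attribute, window=20):
        self.attribute = attribute
        # Sliding window of (attribute value, label, was_prediction_correct).
        self.window = deque(maxlen=window)

    def predict(self, example):
        value = example[self.attribute]
        counts = Counter(lbl for v, lbl, _ in self.window if v == value)
        if not counts:
            return None
        # Majority label in the window; ties go to the smallest label.
        return max(sorted(counts), key=counts.get)

    def update(self, example, label):
        # Test-then-train: score the prediction before storing the example.
        hit = self.predict(example) == label
        self.window.append((example[self.attribute], label, hit))

    def recent_accuracy(self):
        if not self.window:
            return 0.0
        return sum(hit for _, _, hit in self.window) / len(self.window)


class AmbiguousNode:
    """A node that keeps every option alive and predicts with the best one."""

    def __init__(self, attributes, window=20):
        self.options = [Option(a, window) for a in attributes]

    def update(self, example, label):
        for option in self.options:
            option.update(example, label)

    def best_option(self):
        return max(self.options, key=Option.recent_accuracy)

    def predict(self, example):
        return self.best_option().predict(example)


def feed(node, n, target):
    """Stream n examples whose label equals the value of attribute `target`."""
    cycle = [(0, 0), (0, 1), (1, 0), (1, 1)]
    for i in range(n):
        a, b = cycle[i % 4]
        example = {"a": a, "b": b}
        node.update(example, example[target])


def demo():
    node = AmbiguousNode(["a", "b"], window=20)
    feed(node, 40, "a")              # first concept: label follows attribute a
    before = node.best_option().attribute
    feed(node, 40, "b")              # drifted concept: label follows attribute b
    after = node.best_option().attribute
    return before, after


if __name__ == "__main__":
    print(demo())
```

Because every option keeps scoring itself on recent examples, the preferred option switches once the window fills with examples from the new concept. The paper's mechanisms for split selection, drift detection and recalling recurring concepts are not reproduced here.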

Created: Mon, 01 Mar 2010, 15:31:39 EST by Dr Xue Li on behalf of School of Information Technol and Elec Engineering