Filtering duplicate items over distributed data streams

Xia, Tian, Jin, Cheqing, Zhou, Xiaofang and Zhou, Aoying (2005). Filtering duplicate items over distributed data streams. In: Wenfei Fan, Zhaohui Wu and Jun Yang, Lecture Notes in Computer Science. Advances in web-age information management: 6th International Conference, WAIM 2005. Sixth International Conference on Web-Age Information Management (WAIM 2005), Hangzhou, China, (779-784). 11-13 October 2005. doi:10.1007/11563952


Author Xia, Tian
Jin, Cheqing
Zhou, Xiaofang
Zhou, Aoying
Title of paper Filtering duplicate items over distributed data streams
Conference name Sixth International Conference on Web-Age Information Management (WAIM 2005)
Conference location Hangzhou, China
Conference dates 11-13 October 2005
Proceedings title Lecture Notes in Computer Science. Advances in web-age information management: 6th International Conference, WAIM 2005
Journal name Advances in Web-Age Information Management, Proceedings
Series Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Place of Publication Berlin; Heidelberg, Germany
Publisher Springer Berlin/Heidelberg
Publication Year 2005
Sub-type Fully published paper
DOI 10.1007/11563952
Open Access Status DOI
ISBN 3540292276
ISSN 0302-9743
1611-3349
Editor Wenfei Fan
Zhaohui Wu
Jun Yang
Volume 3739
Start page 779
End page 784
Total pages 6
Language eng
Abstract/Summary In recent years, many real-time applications have needed to handle data streams. We consider distributed environments in which remote data sources continuously collect data from the real world or from other data sources and push the data to a central stream processor. In such environments, transmitting rapid, high-volume, time-varying data streams induces significant communication cost and also incurs computing overhead at the central processor. In this paper, we develop a novel filtering approach, called DTFilter, for evaluating windowed distinct queries in such a distributed system. DTFilter is based on a search algorithm that uses a data structure of two height-balanced trees; it avoids transmitting duplicate items in data streams and thereby saves considerable network resources. In addition, we provide a theoretical analysis of the search time and of the amount of memory needed. Extensive experiments also show that DTFilter achieves high performance.
Subjects E1
280108 Database Management
700103 Information processing services
Q-Index Code E1
Q-Index Status Provisional Code
Institutional Status UQ
Additional Notes Book Series: Lecture Notes in Computer Science; Title: Advances in Web-Age Information Management

Citation counts: TR Web of Science Citation Count Cited 0 times in Thomson Reuters Web of Science
Scopus Citation Count Cited 4 times in Scopus
Created: Fri, 24 Aug 2007, 07:24:49 EST