AML: efficient Approximate Membership Localization within a web-based join framework

Li, Zhixu, Sitbon, Laurianne, Wang, Liwei, Zhou, Xiaofang and Du, Xiaoyong (2013) AML: efficient Approximate Membership Localization within a web-based join framework. IEEE Transactions On Knowledge and Data Engineering, 25 2: 298-310. doi:10.1109/TKDE.2011.178

Author Li, Zhixu
Sitbon, Laurianne
Wang, Liwei
Zhou, Xiaofang
Du, Xiaoyong
Title AML: efficient Approximate Membership Localization within a web-based join framework
Journal name IEEE Transactions On Knowledge and Data Engineering
ISSN 1041-4347
Publication date 2013-02
Sub-type Article (original research)
DOI 10.1109/TKDE.2011.178
Volume 25
Issue 2
Start page 298
End page 310
Total pages 13
Place of publication Piscataway, NJ, United States
Publisher Institute of Electrical and Electronics Engineers
Collection year 2014
Language eng
Abstract In this paper, we propose a new type of Dictionary-based Entity Recognition Problem, named Approximate Membership Localization (AML). The popular Approximate Membership Extraction (AME) provides full coverage of the true matched substrings in a given document, but its many redundancies make the AME process inefficient and degrade the performance of real-world applications that use the extracted substrings. The AML problem aims to locate nonoverlapping substrings, which are a better approximation of the true matched substrings, without generating overlapping redundancies. To perform AML efficiently, we propose the optimized algorithm P-Prune, which prunes a large part of the overlapping redundant matched substrings before generating them. Our study using several real-world data sets demonstrates the efficiency of P-Prune over a baseline method. We also study AML as applied to a proposed web-based join framework, a search-based approach that joins two tables using dictionary-based entity recognition from web documents. The results not only prove the advantage of AML over AME, but also demonstrate the effectiveness of our search-based approach.
Keyword Web-based join
Approximate membership location
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes Published online: 5 August 2011.

Document type: Journal Article
Collections: Official 2014 Collection
School of Information Technology and Electrical Engineering Publications
Citation counts: TR Web of Science Citation Count: Cited 3 times in Thomson Reuters Web of Science
Scopus Citation Count: Cited 4 times in Scopus
Created: Sun, 24 Mar 2013, 00:53:30 EST by System User on behalf of School of Information Technol and Elec Engineering