WebPut: efficient web-based data imputation

Li, Zhixu, Sharaf, Mohamed A., Sitbon, Laurianne, Sadiq, Shazia, Indulska, Marta and Zhou, Xiaofang (2012). WebPut: efficient web-based data imputation. In: X. Sean Wang, Isabel Cruz, Alex Delis and Guangyan Huang, Web Information Systems Engineering - WISE 2012. WISE 2012 : 13th international conference, Paphos, Cyprus, (243-256). 28-30 November 2012. doi:10.1007/978-3-642-35063-4_18


Author Li, Zhixu
Sharaf, Mohamed A.
Sitbon, Laurianne
Sadiq, Shazia
Indulska, Marta
Zhou, Xiaofang
Title of paper WebPut: efficient web-based data imputation
Conference name WISE 2012 : 13th international conference
Conference location Paphos, Cyprus
Conference dates 28-30 November 2012
Proceedings title Web Information Systems Engineering - WISE 2012
Journal name Lecture Notes in Computer Science
Place of Publication Heidelberg, Germany
Publisher Springer
Publication Year 2012
Sub-type Fully published paper
DOI 10.1007/978-3-642-35063-4_18
ISBN 9783642350634
3642350631
ISSN 0302-9743
Editor X. Sean Wang
Isabel Cruz
Alex Delis
Guangyan Huang
Volume 7651
Start page 243
End page 256
Total pages 14
Collection year 2013
Language eng
Formatted Abstract/Summary
In this paper, we present WebPut, a prototype system that adopts a novel web-based approach to the data imputation problem. To this end, WebPut utilizes the available information in an incomplete database in conjunction with the data consistency principle. Moreover, WebPut extends effective Information Extraction (IE) methods for the purpose of formulating web search queries that are capable of retrieving missing values with high accuracy. WebPut employs a confidence-based scheme that efficiently leverages our suite of data imputation queries to automatically select the most effective imputation query for each missing value. A greedy iterative algorithm is also proposed to schedule the imputation order of the different missing values in a database, and in turn the issuing of their corresponding imputation queries, to improve the accuracy and efficiency of WebPut. Experiments based on several real-world data collections demonstrate that WebPut outperforms existing approaches.
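The abstract's two scheduling ideas (choosing the most confident imputation query for each missing value, and greedily ordering the imputations) can be illustrated with a minimal sketch. This is not the paper's algorithm: the function greedy_impute, the min_conf threshold, the confidence scores, and the toy lookup tables standing in for web search are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]   # None marks a missing value
Cell = Tuple[int, str]           # (row index, attribute name)
# Hypothetical imputation query: returns (value, confidence) or None.
Query = Callable[[Row], Optional[Tuple[str, float]]]


def greedy_impute(rows: List[Row],
                  queries: Dict[str, List[Query]],
                  min_conf: float = 0.5) -> List[Cell]:
    """Fill missing cells in place, most confident answer first.

    Returns the order in which cells were filled.
    """
    order: List[Cell] = []
    while True:
        best = None  # (confidence, row index, attribute, value)
        for i, row in enumerate(rows):
            for attr, value in row.items():
                if value is not None:
                    continue
                # Try every query registered for this attribute.
                for query in queries.get(attr, []):
                    answer = query(row)
                    if answer is None:
                        continue
                    val, conf = answer
                    if conf >= min_conf and (best is None or conf > best[0]):
                        best = (conf, i, attr, val)
        if best is None:
            return order  # nothing left that can be filled confidently
        _, i, attr, val = best
        rows[i][attr] = val  # a filled value may enable later queries
        order.append((i, attr))


# Toy stand-ins for web search results (assumed data, not from the paper).
COUNTRY_OF = {"Louvre": "France"}
CAPITAL_OF = {"France": "Paris"}


def country_from_museum(row: Row) -> Optional[Tuple[str, float]]:
    museum = row.get("museum")
    return (COUNTRY_OF[museum], 0.9) if museum in COUNTRY_OF else None


def capital_from_country(row: Row) -> Optional[Tuple[str, float]]:
    country = row.get("country")
    return (CAPITAL_OF[country], 0.8) if country in CAPITAL_OF else None


rows = [{"museum": "Louvre", "country": None, "capital": None}]
queries = {"country": [country_from_museum],
           "capital": [capital_from_country]}
order = greedy_impute(rows, queries)
print(order, rows)
```

In this toy run the capital can only be retrieved once the country has been imputed, which shows why the order in which missing values are filled can matter.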
Keyword Incomplete Data
Web based Data Imputation
WebPut
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

 
Created: Mon, 04 Mar 2013, 15:37:14 EST by Marta Indulska on behalf of UQ Business School