A pattern-based framework for addressing data representational inconsistency

Yi, Bingyu, Hua, Wen and Sadiq, Shazia (2016). A pattern-based framework for addressing data representational inconsistency. In: Muhammad Aamir Cheema, Wenjie Zhang and Lijun Chang, Databases Theory and Applications - 27th Australasian Database Conference, ADC 2016, Proceedings. 27th Australasian Database Conference on Databases Theory and Applications, ADC 2016, Sydney, NSW, Australia, (395-406). 28-29 September 2016. doi:10.1007/978-3-319-46922-5_31


Author Yi, Bingyu
Hua, Wen
Sadiq, Shazia
Title of paper A pattern-based framework for addressing data representational inconsistency
Conference name 27th Australasian Database Conference on Databases Theory and Applications, ADC 2016
Conference location Sydney, NSW, Australia
Conference dates 28-29 September 2016
Proceedings title Databases Theory and Applications - 27th Australasian Database Conference, ADC 2016, Proceedings
Journal name Lecture Notes in Computer Science
Place of Publication Heidelberg, Germany
Publisher Springer
Publication Year 2016
Sub-type Fully published paper
DOI 10.1007/978-3-319-46922-5_31
Open Access Status Not yet assessed
ISBN 9783319469218
ISSN 1611-3349
0302-9743
Editor Muhammad Aamir Cheema
Wenjie Zhang
Lijun Chang
Volume 9877
Start page 395
End page 406
Total pages 12
Chapter number 31
Total chapters 44
Collection year 2017
Language eng
Abstract/Summary Data representational inconsistency, where data has diverse formats or structures, is a crucial data quality problem. Existing approaches to fixing it either target a specific domain or require extensive input from users. In this work, we propose a user-friendly pattern-based framework for addressing data representational inconsistency. Our framework consists of three modules: pattern design, pattern detection, and pattern unification. We identify several challenges in all three tasks that must be met to handle an inconsistent dataset both accurately and efficiently. We propose various techniques to tackle these issues, and our experimental results on real-life datasets demonstrate better performance of our proposals compared with existing methods.
Q-Index Code E1
Q-Index Status Provisional Code
Institutional Status UQ

 
Created: Sun, 25 Dec 2016, 10:16:01 EST by System User on behalf of Learning and Research Services (UQ Library)