A framework for data quality aware query systems

Yeganeh, Naiem K., Sadiq, Shazia and Sharaf, Mohamed A. (2014) A framework for data quality aware query systems. Information Systems, 46, 24-44. doi:10.1016/j.is.2014.05.005


Author Yeganeh, Naiem K.
Sadiq, Shazia
Sharaf, Mohamed A.
Title A framework for data quality aware query systems
Journal name Information Systems
ISSN 0306-4379
1873-6076
Publication date 2014-12
Sub-type Article (original research)
DOI 10.1016/j.is.2014.05.005
Volume 46
Start page 24
End page 44
Total pages 21
Place of publication Oxford, United Kingdom
Publisher Elsevier
Collection year 2015
Language eng
Formatted abstract
Highlights
• A framework for a data quality aware query system is proposed.
• User preferences on data quality are used as the basis for query answering.
• Advanced techniques to estimate data quality of query results have been developed and evaluated.
• The framework is demonstrated through a prototype implementation.

The issue of data quality is increasingly important as individuals as well as corporations are relying on multiple, often external sources of data to make decisions. Traditional query systems do not factor in data quality considerations in their response. Further, studies into the diverse interpretations of data quality indicate that fitness for use is a fundamental criterion in the evaluation of data quality. In this paper we address the issue of data quality aware query systems by developing a query answering framework that considers user data quality preferences over a collaborative information systems architecture. Our work is motivated by an extensive study of data quality literature that revealed a lack of holistic solutions that encompass both business and technological aspects of data quality management. Accordingly the developed framework for data quality aware query systems takes an end-to-end view of the problem. In this paper we have focused on three major aspects relating to quality aware query systems, namely measuring data quality, modeling of user's data quality preferences, and answering the query in consideration of the defined preferences and measures. We then address each of these issues by introducing data quality profiling, data quality aware SQL, and data quality aware query answering methods. Contributions of this paper have been evaluated on real and simulated data. The individual components have also been assembled into a running prototype.
Keyword Data quality
Data profiling
User preference modeling
Query systems
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

Document type: Journal Article
Collections: Official 2015 Collection
School of Information Technology and Electrical Engineering Publications
 
Citation counts: Thomson Reuters Web of Science: cited 2 times
Scopus: cited 3 times
Created: Sun, 14 Sep 2014, 00:16:55 EST by Dr Mohamed Sharaf on behalf of School of Information Technology and Electrical Engineering