Truth discovery in material science databases

Belisle, Eve, Huang, Zi and Gheribi, Aimen (2015). Truth discovery in material science databases. In: Mohamed A. Sharaf, Muhammad Aamir Cheema and Jianzhong Qi, Databases Theory and Applications. 26th Australasian Database Conference (ADC), Melbourne, Australia, (269-280). 4-7 Jun 2015. doi:10.1007/978-3-319-19548-3_22

Author Belisle, Eve
Huang, Zi
Gheribi, Aimen
Title of paper Truth discovery in material science databases
Conference name 26th Australasian Database Conference (ADC)
Conference location Melbourne, Australia
Conference dates 4-7 Jun 2015
Proceedings title Databases Theory and Applications
Journal name Databases Theory and Applications
Series Lecture Notes in Computer Science
Place of Publication Heidelberg, Germany
Publisher Springer
Publication Year 2015
Sub-type Fully published paper
DOI 10.1007/978-3-319-19548-3_22
Open Access Status Not Open Access
ISBN 9783319195476
ISSN 0302-9743
Editor Mohamed A. Sharaf
Muhammad Aamir Cheema
Jianzhong Qi
Volume 9093
Start page 269
End page 280
Total pages 12
Language eng
Abstract/Summary Instead of performing expensive experiments, it is common in industry to predict important material properties from existing experimental results. Databases of experimental observations are widely used in the field of materials science and engineering. However, these databases are expected to be noisy, both because they rely on human measurements and because they are an amalgamation of various independent sources (research papers). As a result, conflicting information can be found between sources. In this paper, we introduce a novel truth discovery approach to reduce the amount of noise and filter out the incorrect conflicting information hidden in scientific databases. Our method ranks the multiple data sources by considering the relationships between them, i.e., the amount of conflicting information and the amount of agreement, and also eliminates the conflicting information. The scalable Gaussian process interpolation technique (SGP) is then applied to the clean dataset to predict material properties. A comprehensive performance study has been carried out on a real-life scientific database. With our new approach, we are able to substantially improve the accuracy of SGP predictions and provide a more reliable database.
Subjects 2614 Theoretical Computer Science
1700 Computer Science
Keyword Prediction
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status UQ
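
Illustrative sketch (not the authors' algorithm): the abstract describes ranking sources by how much they agree or conflict with one another, then discarding conflicting entries before interpolation. The Python below is a minimal toy version of that general idea only. The function names, the relative tolerance `tol`, the agreement-ratio score, and the sample values are all assumptions made for illustration; none of them come from the paper, and the SGP prediction step is not shown.

```python
from collections import defaultdict
from itertools import combinations


def _agrees(v1, v2, tol):
    """True if two values differ by at most `tol` in relative terms."""
    scale = max(abs(v1), abs(v2), 1e-12)
    return abs(v1 - v2) / scale <= tol


def rank_sources(observations, tol=0.05):
    """Score each source by its share of agreeing pairwise comparisons.

    observations: list of (source, key, value) tuples, where `key`
    identifies the quantity being measured. Hypothetical scoring rule:
    score = agreements / (agreements + conflicts), or 0.5 if a source
    was never compared with another one.
    """
    by_key = defaultdict(list)
    for source, key, value in observations:
        by_key[key].append((source, value))

    agree = defaultdict(int)
    conflict = defaultdict(int)
    for entries in by_key.values():
        for (s1, v1), (s2, v2) in combinations(entries, 2):
            if s1 == s2:
                continue  # only compare different sources
            bucket = agree if _agrees(v1, v2, tol) else conflict
            bucket[s1] += 1
            bucket[s2] += 1

    scores = {}
    for source in {s for s, _, _ in observations}:
        total = agree[source] + conflict[source]
        scores[source] = agree[source] / total if total else 0.5
    return scores


def filter_conflicts(observations, scores, tol=0.05):
    """Per key, keep only values that agree with the top-scored source."""
    by_key = defaultdict(list)
    for obs in observations:
        by_key[obs[1]].append(obs)

    kept = []
    for entries in by_key.values():
        best = max(entries, key=lambda o: scores[o[0]])
        kept.extend(o for o in entries if _agrees(o[2], best[2], tol))
    return kept


# Made-up toy values, not real measurements.
observations = [
    ("paper_A", "m1", 237.0), ("paper_B", "m1", 236.0), ("paper_C", "m1", 180.0),
    ("paper_A", "m2", 401.0), ("paper_B", "m2", 400.0), ("paper_C", "m2", 300.0),
]
scores = rank_sources(observations)
clean = filter_conflicts(observations, scores)
```

In this toy data, `paper_C` disagrees with both other sources on every key, so it receives the lowest score and its entries are dropped from `clean`. The cleaned list is what a downstream interpolator would then be fitted on.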

Document type: Conference Paper
Sub-type: Fully published paper
Collections: Official 2016 Collection
School of Information Technology and Electrical Engineering Publications
Citation counts: Web of Science (Thomson Reuters): cited 0 times
Scopus: cited 0 times
Created: Sun, 20 Dec 2015, 10:24:04 EST by System User on behalf of Scholarly Communication and Digitisation Service