Automated discovery of multi-faceted ontologies for accurate query answering and future semantic reasoning

Gollapalli, Mohammed, Li, Xue and Wood, Ian (2013) Automated discovery of multi-faceted ontologies for accurate query answering and future semantic reasoning. Data and Knowledge Engineering, 87 405-424. doi:10.1016/j.datak.2013.05.005

Author Gollapalli, Mohammed
Li, Xue
Wood, Ian
Title Automated discovery of multi-faceted ontologies for accurate query answering and future semantic reasoning
Journal name Data and Knowledge Engineering
ISSN 0169-023X
Publication date 2013-01-01
Year available 2013
Sub-type Article (original research)
DOI 10.1016/j.datak.2013.05.005
Volume 87
Start page 405
End page 424
Total pages 20
Place of publication Amsterdam, The Netherlands
Publisher Elsevier
Language eng
Abstract There has been a surge of interest in the development of probabilistic techniques to discover meaningful data facts across multiple datasets provided by different organizations. The key aim is to approximate the structure and content of the induced data into a concise synopsis in order to extract meaningful data facts. Performing sensible queries across unrelated datasets is a complex task that requires a complete understanding of each contributing database's schema to define the structure of its information. Alternative approaches that use data modeling enterprise tools have been proposed, in order to give users without complex schema knowledge the ability to query databases. Unfortunately, data modeling-based matching is a content-based technique and incurs significant query evaluation costs, due to attribute level pairwise comparisons. We propose a multi-faceted classification technique for performing structural analysis on knowledge domain clusters, using a novel Ontology Guided Data Linkage (OGDL) framework. This framework supports self-organization of contributing databases through the discovery of structural dependencies, by performing multi-level exploitation of ontological domain knowledge relating to tables, attributes and tuples. The framework thus automates the discovery of schema structures across unrelated databases, based on the use of direct and weighted correlations between different ontological concepts, using an h-gram (hash gram) record matching technique for concept clustering and cluster mapping. We demonstrate the feasibility of our OGDL algorithms through a set of accuracy, performance and scalability experimental tests run on real-world datasets, and show that our system runs in polynomial time and performs well in practice.
To the best of our knowledge, this is the first attempt to solve data linkage problems using a multi-faceted cluster mapping strategy, and we believe that our approach presents a significant advancement towards accurate query answering and future real-time online semantic reasoning capacity.
Keyword Concept modeling
Data and knowledge visualization
Query processing
Semantic reasoning
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

Document type: Journal Article
Sub-type: Article (original research)
Collections: School of Mathematics and Physics
Official 2014 Collection
School of Information Technology and Electrical Engineering Publications
Citation counts: Cited 3 times in Thomson Reuters Web of Science
Cited 3 times in Scopus
Created: Fri, 29 Nov 2013, 06:10:40 EST by System User on behalf of School of Information Technol and Elec Engineering