From raw publications to linked data

Groza, Tudor, Grimnes, Gunnar AAstrand, Handschuh, Siegfried and Decker, Stefan (2013) From raw publications to linked data. Knowledge and Information Systems, 34(1): 1-21. doi:10.1007/s10115-011-0473-6


Author Groza, Tudor
Grimnes, Gunnar AAstrand
Handschuh, Siegfried
Decker, Stefan
Title From raw publications to linked data
Journal name Knowledge and Information Systems
ISSN 0219-1377
0219-3116
Publication date 2013-01
Year available 2011
Sub-type Article (original research)
DOI 10.1007/s10115-011-0473-6
Volume 34
Issue 1
Start page 1
End page 21
Total pages 21
Place of publication Guildford, Surrey, U.K.
Publisher Springer
Collection year 2012
Language eng
Formatted abstract
The continuous development of the Linked Data Web depends on the advancement of the underlying extraction mechanisms. This is of particular interest for the scientific publishing domain, where most data sets are currently created manually. In this article, we present a Machine Learning pipeline that enables the automatic extraction of heading metadata (i.e., title, authors, etc.) from scientific publications. The experimental evaluation shows that our solution handles any type of publication format very well and improves on the average extraction performance of the state of the art by around 4%, in addition to showing increased versatility. Finally, we propose a flexible Linked Data-driven mechanism to be used both for refining and linking the automatically extracted metadata.
Keyword Metadata extraction
Support vector machines
Conditional random fields
Linked data
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes Published online: 29 December 2011.

Document type: Journal Article
Collections: Official 2012 Collection
School of Information Technology and Electrical Engineering Publications
 
Citation counts: Thomson Reuters Web of Science: cited 6 times
Scopus: cited 6 times
Created: Tue, 24 Jan 2012, 14:07:50 EST by Dr Tudor Groza on behalf of School of Information Technol and Elec Engineering