Identifying scientific artefacts in biomedical literature: the Evidence Based Medicine use case

Hassanzadeh, Hamed, Groza, Tudor and Hunter, Jane (2014) Identifying scientific artefacts in biomedical literature: the Evidence Based Medicine use case. Journal of Biomedical Informatics, 49 159-170. doi:10.1016/j.jbi.2014.02.006

Author Hassanzadeh, Hamed
Groza, Tudor
Hunter, Jane
Title Identifying scientific artefacts in biomedical literature: the Evidence Based Medicine use case
Journal name Journal of Biomedical Informatics
ISSN 1532-0464
Publication date 2014-02-14
Sub-type Article (original research)
DOI 10.1016/j.jbi.2014.02.006
Volume 49
Start page 159
End page 170
Total pages 12
Place of publication Maryland Heights, MO, United States
Publisher Academic Press
Language eng
Formatted abstract
• Classification of sentences in Evidence Based Medicine abstracts, using a standard abstract structure.
• Supervised sentence-oriented classification using the PIBOSO scheme.
• Lexical, statistical and sequential features, independent of external sources.
• Improvement of around 25 percentage points in F-score over the state of the art.

Evidence Based Medicine (EBM) provides a framework that uses the current best evidence in the domain to support clinicians in the decision-making process. In most cases, the underlying foundational knowledge is captured in scientific publications that detail specific clinical studies or randomised controlled trials. Over the last two decades, research has focused on modelling key aspects described within publications (e.g., aims, methods, results) to help realise the goals of EBM. A significant outcome of this research has been the PICO (Population/Problem–Intervention–Comparison–Outcome) structure and its refined version, PIBOSO (Population–Intervention–Background–Outcome–Study Design–Other), both of which formalise these scientific artefacts. Using these schemes, diverse automatic extraction techniques have since been proposed to streamline knowledge discovery and exploration in EBM. In this paper, we present a Machine Learning approach that classifies sentences according to the PIBOSO scheme. We use a discriminative set of features that do not rely on any external resources to achieve results comparable to the state of the art. A corpus of 1000 structured and unstructured abstracts (the NICTA-PIBOSO corpus) is used for training and testing. Our best Conditional Random Field (CRF) classifier achieves micro-average F-scores of 90.74% and 87.21% over structured and unstructured abstracts, respectively, which represent increases of 25.48 and 26.6 percentage points in F-score over the best existing approaches.
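The abstract reports results as micro-average F-scores. As a rough illustration of what that metric measures, the sketch below pools true positives, false positives and false negatives across all sentences before computing precision and recall. It is not the authors' evaluation code: the function name `micro_f_score`, the toy sentences and the choice to represent each sentence's annotation as a set of labels (which allows more than one label per sentence) are all assumptions made for this example.

```python
# Illustrative sketch only; not the evaluation code used in the paper.
# Assumption: each sentence's annotation is a set of PIBOSO labels.

PIBOSO_LABELS = [
    "Population", "Intervention", "Background",
    "Outcome", "Study Design", "Other",
]


def micro_f_score(gold, predicted):
    """Micro-averaged F-score over per-sentence label sets.

    Counts are pooled over every sentence and label before precision
    and recall are computed, so frequent labels weigh more than rare ones.
    """
    tp = fp = fn = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        tp += len(gold_labels & predicted_labels)   # correctly assigned labels
        fp += len(predicted_labels - gold_labels)   # assigned but not in gold
        fn += len(gold_labels - predicted_labels)   # in gold but not assigned
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical four-sentence abstract: one sentence is mislabelled.
gold = [{"Background"}, {"Population"}, {"Intervention"}, {"Outcome"}]
predicted = [{"Background"}, {"Population"}, {"Outcome"}, {"Outcome"}]

score = micro_f_score(gold, predicted)
print(f"micro-average F-score: {score:.2%}")
```

In this toy case three of four labels are correct, with one spurious and one missed label, so pooled precision and recall are both 0.75.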
Keyword Evidence Based Medicine
Text classification
Machine learning
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes Available online 14 February 2014

Document type: Journal Article
Collections: Official 2015 Collection
School of Information Technology and Electrical Engineering Publications
Citation counts: Thomson Reuters Web of Science: cited 4 times
Scopus: cited 6 times
Created: Tue, 25 Mar 2014, 23:33:05 EST by Dr Tudor Groza on behalf of the School of Information Technology and Electrical Engineering