A Supervised Approach to Quantifying Sentence Similarity: With Application to Evidence Based Medicine

Hassanzadeh, Hamed, Groza, Tudor, Nguyen, Anthony and Hunter, Jane (2015) A Supervised Approach to Quantifying Sentence Similarity: With Application to Evidence Based Medicine. PLoS ONE, 10(6): 1-25. doi:10.1371/journal.pone.0129392

Author Hassanzadeh, Hamed
Groza, Tudor
Nguyen, Anthony
Hunter, Jane
Title A Supervised Approach to Quantifying Sentence Similarity: With Application to Evidence Based Medicine
Journal name PLoS ONE
ISSN 1932-6203
Publication date 2015-06-01
Year available 2015
Sub-type Article (original research)
DOI 10.1371/journal.pone.0129392
Open Access Status DOI
Volume 10
Issue 6
Start page 1
End page 25
Total pages 25
Place of publication San Francisco, CA, United States
Publisher Public Library of Science
Language eng
Subject 1300 Biochemistry, Genetics and Molecular Biology
1100 Agricultural and Biological Sciences
Abstract In Evidence Based Medicine (EBM), practitioners use existing evidence to make therapeutic decisions. This evidence, in the form of scientific statements, is usually found in scholarly publications such as randomised controlled trials and systematic reviews. However, finding such information in the overwhelming amount of published material is particularly challenging. Approaches have been proposed to automatically extract scientific artefacts in EBM using standardised schemas. Our work takes this line of research a step further and looks into consolidating extracted artefacts, i.e., quantifying their degree of similarity based on the assumption that they carry the same rhetorical role. By semantically connecting key statements in the EBM literature, practitioners can not only find available evidence more easily, but also track the effects of different treatments/outcomes across a number of related studies. We devise a regression model based on a varied set of features and evaluate it on both a general English corpus (the SICK corpus) and an EBM corpus (the NICTA-PIBOSO corpus). Experimental results show that our approach performs on par with the state of the art on the general English corpus and achieves encouraging results on the biomedical text when compared against human judgement.
Keyword Evidence based medicine
Regression analysis
Q-Index Code C1
Q-Index Status Confirmed Code
Grant ID DE120100508
Institutional Status UQ

Document type: Journal Article
Collections: Official 2016 Collection
School of Information Technology and Electrical Engineering Publications
Citation counts: Thomson Reuters Web of Science: cited 1 time
Scopus: cited 1 time
Created: Sun, 05 Jul 2015, 10:31:38 EST by System User on behalf of Scholarly Communication and Digitisation Service