Collection of cancer stage data by classifying free-text medical reports

McCowan, Iain A., Moore, Darren C., Nguyen, Anthony N., Bowman, Rayleen V., Clarke, Belinda E., Duhig, Edwina E. and Fry, Mary-Jane (2007) Collection of cancer stage data by classifying free-text medical reports. Journal of The American Medical Informatics Association, 14(6): 736-745. doi:10.1197/jamia.M2130


Author McCowan, Iain A.
Moore, Darren C.
Nguyen, Anthony N.
Bowman, Rayleen V.
Clarke, Belinda E.
Duhig, Edwina E.
Fry, Mary-Jane
Title Collection of cancer stage data by classifying free-text medical reports
Journal name Journal of The American Medical Informatics Association
ISSN 1067-5027
Publication date 2007-11
Sub-type Article (original research)
DOI 10.1197/jamia.M2130
Volume 14
Issue 6
Start page 736
End page 745
Total pages 10
Place of publication London, United Kingdom
Publisher BMJ Group
Language eng
Subject 110203 Respiratory Diseases
Abstract Cancer staging provides a basis for planning clinical management, but also allows for meaningful analysis of cancer outcomes and evaluation of cancer care services. Despite this, stage data in cancer registries is often incomplete, inaccurate, or simply not collected. This article describes a prototype software system (Cancer Stage Interpretation System, CSIS) that automatically extracts cancer staging information from medical reports. The system uses text classification techniques to train support vector machines (SVMs) to extract elements of stage listed in cancer staging guidelines. When processing new reports, CSIS identifies sentences relevant to the staging decision, and subsequently assigns the most likely stage. The system was developed using a database of staging data and pathology reports for 710 lung cancer patients, then validated in an independent set of 179 patients against pathologic stage assigned by two independent pathologists. CSIS achieved overall accuracy of 74% for tumor (T) staging and 87% for node (N) staging, and errors were observed to mirror disagreements between human experts.
Keyword Medical Informatics
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status UQ

Document type: Journal Article
Collections: Excellence in Research Australia (ERA) - Collection
ERA 2012 Admin Only
School of Medicine Publications
Citation counts: Thomson Reuters Web of Science: cited 20 times
Scopus: cited 25 times
Created: Mon, 18 Feb 2008, 15:49:49 EST