Healthcare data mining from multi-source data

Chen, Ling (2017). Healthcare data mining from multi-source data. PhD Thesis, School of Information Technology and Electrical Engineering, The University of Queensland. doi:10.14264/uql.2017.365

Attached Files
Name Description MIMEType Size Downloads
s4193152_phd_finalthesis.pdf Full text (open access) application/pdf 2.37MB 0
Author Chen, Ling
Thesis Title Healthcare data mining from multi-source data
School, Centre or Institute School of Information Technology and Electrical Engineering
Institution The University of Queensland
DOI 10.14264/uql.2017.365
Publication date 2017-02-28
Thesis type PhD Thesis
Supervisor Xue Li
Mohamed A. Sharaf
Total pages 171
Language eng
Subjects 0801 Artificial Intelligence and Image Processing
0806 Information Systems
Abstract/Summary The "big data" challenge is changing the way we acquire, store, analyse, and draw conclusions from data. How to effectively and efficiently "mine" data from possibly multiple sources and extract useful information is a critical question. Healthcare data mining has drawn increasing research attention, with the ultimate goal of improving the quality of care. The human body is complex, and so too is the data collected in treating it. Noise that is often introduced during the collection process makes building data mining models a challenging task. This thesis focuses on the classification tasks of mining healthcare data, with the goal of improving the effectiveness of health risk prediction. In particular, we developed algorithms to address issues identified from real healthcare data, such as feature extraction, heterogeneity, label uncertainty, and large unlabeled data. The three main contributions of this research are as follows. First, we developed a new health index called the Personal Health Index (PHI), which scores a person's health status based on the examination records of a given population. Second, we identified the key characteristics of the real datasets and the issues associated with the data. Third, we developed classification algorithms to cope with those issues, particularly label uncertainty and large unlabeled data. This research takes one step towards scoring personal health by mining increasingly large health records. In particular, it pioneers the mining of GHE data and tackles the associated challenges. We anticipate that, in the near future, more robust data-mining-based health scoring systems will be available to help healthcare professionals understand people's health status and thus improve the quality of care.
Keyword Personal health index mining
Health examination records
Classification with label uncertainty
Classification with large unlabeled data
Graph-based semi-supervised learning

Document type: Thesis
Collections: UQ Theses (RHD) - Official
UQ Theses (RHD) - Open Access
Created: Wed, 08 Feb 2017, 06:35:47 EST by Ling Chen on behalf of Learning and Research Services (UQ Library)