With the incidence of chronic disease rising worldwide as a result of modern lifestyles, there is an urgent need for a healthcare recommender system for Chronic Disease Diagnosis (CDD). Such a system can play a major role in controlling disease by providing justifiable, real-time, accurate and trustworthy disease risk predictions and medical advice recommendations. These recommendations cover types of food intake, types of exercise, treatment, healthcare information, etc. A CDD recommender system is regarded as an additional tool that assists physicians in controlling and managing patients’ diseases, and that provides a wide range of health education to increase patients’ awareness and to influence their attitudes and knowledge regarding health improvement. Providing accurate real-time recommendations from medical data is challenging because of the complexity of such data, which are typically imbalanced, large, multi-dimensional, noisy and/or incomplete. A CDD system is expected to deliver highly accurate disease risk prediction and medical advice recommendation.
In this research, a CDD recommender system is proposed based on a hybrid collaborative filtering method that combines multiple classification and Integrated Collaborative Filtering (ICF) approaches. In the first stage of the CDD system, multiple classifiers based on decision tree algorithms are used to obtain the learning model that most accurately predicts the disease risk of the monitored cases. Training is a key factor in achieving an accurate model; therefore, real historical medical data from previous patients are used to train the model. Relevant features are determined through a feature selection method to improve the performance of the learning model. Patients’ lab and home test readings are merged in this work to improve diagnostic fidelity.
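As a minimal sketch of the merging step only, the following Python joins one laboratory record per patient with the mean of that patient's daily home readings, producing a single feature row per patient. The field names (hba1c, glucose), the patient identifiers, the toy values and the choice of the mean as the aggregate are all assumptions made for illustration; none of them is taken from the dataset used in this work.

```python
from statistics import mean


def merge_readings(lab, home):
    """Join per-patient lab results with the mean of each daily home reading.

    lab:  {patient_id: {test_name: value}}            (hypothetical layout)
    home: {patient_id: [{reading_name: value}, ...]}  (one dict per day)
    """
    merged = {}
    for pid, lab_row in lab.items():
        row = dict(lab_row)                     # start from the lab tests
        days = home.get(pid, [])                # may be empty for some patients
        for name in sorted({key for day in days for key in day}):
            values = [day[name] for day in days if name in day]
            row[f"home_mean_{name}"] = mean(values)
        merged[pid] = row
    return merged


# Toy, made-up records for two hypothetical patients.
lab = {"p1": {"hba1c": 7.9}, "p2": {"hba1c": 5.4}}
home = {
    "p1": [{"glucose": 180}, {"glucose": 200}],
    "p2": [{"glucose": 95}, {"glucose": 105}],
}
merged = merge_readings(lab, home)
print(merged)
```

Each merged row can then be passed to feature selection and to the classifiers as one feature vector per monitored case.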
In the second stage of the CDD system, the ICF approach, based on clustering and classification modelling, is proposed to achieve highly accurate medical advice recommendations. In ICF, we incorporate the result of a clustering algorithm, such as K-means, into a classification engine, such as Random Forest (RF), using both historical binary ratings and features. A diabetes diagnosis case study is designed as a set of experiments to show the feasibility of our model. The diabetes dataset was collected from hospitals in the Sultanate of Oman. Multiple decision tree classifiers were applied in the first stage of the CDD system. Our experiments show that the CDD recommender system achieved better performance by building the ensemble-of-trees RF model using bootstrap samples of the training data and random feature selection, by considering the historical medical data of previous patients, and by merging the long-term laboratory tests with the daily home tests of the monitored cases.
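One plausible reading of the ICF idea can be sketched in plain Python: K-means assigns each patient to a cluster, the cluster index is appended to the patient's features, and a bagged ensemble is trained on the augmented rows to predict a binary rating. This is an illustration under stated assumptions, not the implementation used in the experiments: the ensemble here uses depth-1 trees (stumps) as a simplified stand-in for a full Random Forest, while keeping its two ingredients, bootstrap samples and random feature subsets. The function names, the deterministic farthest-point initialisation and the toy data are all hypothetical.

```python
import random
from collections import Counter


def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: dist2(p, centers[c]))


def kmeans(points, k, iters=10):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def fit_stump(rows, ys, feats):
    """Best one-feature threshold rule (a depth-1 tree) on the given features."""
    best_acc, best = -1, None
    for f in feats:
        vals = sorted({r[f] for r in rows})
        # -inf yields a constant prediction; the rest are midpoints between values.
        cuts = [float("-inf")] + [(a + b) / 2 for a, b in zip(vals, vals[1:])]
        for t in cuts:
            for lo, hi in ((0, 1), (1, 0)):
                acc = sum((hi if r[f] > t else lo) == y for r, y in zip(rows, ys))
                if acc > best_acc:
                    best_acc, best = acc, (f, t, lo, hi)
    return best


def predict_stump(stump, row):
    f, t, lo, hi = stump
    return hi if row[f] > t else lo


def fit_forest(rows, ys, n_trees=25, n_feats=2, seed=0):
    """Bagged stumps: each one sees a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]        # bootstrap sample
        feats = rng.sample(range(len(rows[0])), n_feats)      # random features
        forest.append(fit_stump([rows[i] for i in idx], [ys[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    return Counter(predict_stump(s, row) for s in forest).most_common(1)[0][0]


def fit_icf(features, ratings, k=2):
    """Cluster the patients, append the cluster index, train the ensemble."""
    centers = kmeans(features, k)
    augmented = [list(x) + [nearest(x, centers)] for x in features]
    return centers, fit_forest(augmented, ratings)


def predict_icf(model, x):
    centers, forest = model
    return predict_forest(forest, list(x) + [nearest(x, centers)])


# Toy, made-up data: two numeric features per patient and a binary rating
# (1 = the advice item was rated relevant for that patient).
features = [[1.0, 1.2], [1.5, 0.8], [2.0, 1.9], [1.2, 1.6],
            [8.0, 8.5], [9.0, 8.2], [8.4, 9.1], [8.8, 8.9]]
ratings = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_icf(features, ratings)
print(predict_icf(model, [1.3, 1.1]), predict_icf(model, [8.6, 8.7]))
```

Appending the cluster index lets the classifier use neighbourhood information (which group of similar patients a case belongs to) alongside the raw features, which is the sense in which clustering is integrated into classification here.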
The significance of our research lies in accounting for the complexity and multi-dimensionality of medical data within a recommender system in order to generate efficient and effective disease risk predictions and medical advice recommendations for chronic disease patients. Our contribution towards such prediction and recommendation is the proposed hybrid collaborative filtering method using classification and ICF approaches.