Deep Ensemble Learning Approaches in Healthcare to Enhance the Prediction and Diagnosing Performance: The Workflows, Deployments, and Surveys on the Statistical, Image-Based, and Sequential Datasets.
Duc-Khanh Nguyen, Chung-Hsien Lan, Chien-Lung Chan
Published in: International Journal of Environmental Research and Public Health (2021)
With advances in information technology, and especially the boom in big data, healthcare support systems have improved considerably. Patient data can be collected, retrieved, and stored in real time. These data are valuable for monitoring, diagnosis, and further applications in data analysis and decision-making. Essentially, the data can be divided into three types: statistical, image-based, and sequential. Each type has a different method of retrieval, processing, and deployment. Additionally, the application of machine learning (ML) and deep learning (DL) in healthcare support systems is growing more rapidly than ever, and numerous high-performance architectures have been proposed to optimize decision-making. Because reliability and stability are the most important factors in a healthcare support system, enhancing predictive performance and maintaining model stability are always the top priorities. The main idea of our study comes from ensemble techniques. Numerous studies and data science competitions show that by combining several weak models into one, ensemble models can attain outstanding performance and reliability. We propose three deep ensemble learning (DEL) approaches, each with stable and reliable performance, that are applicable to the above-mentioned data types: deep-stacked generalization ensemble learning, gradient deep learning boosting, and deep aggregation learning. The experimental results show that our proposed approaches achieve more robust and reliable performance than traditional ML and DL techniques on statistical, image-based, and sequential benchmark datasets. In particular, on the Heart Disease UCI dataset, representing the statistical type, the gradient deep learning boosting approach outperforms the others, with accuracy, recall, F1-score, Matthews correlation coefficient (MCC), and area under the curve (AUC) values of 0.87, 0.81, 0.83, 0.73, and 0.91, respectively. On the X-ray dataset, representing the image-based type, the deep aggregation learning approach shows the highest performance, with values of 0.91, 0.97, 0.93, 0.80, and 0.94 on the same five metrics. On the Depresjon dataset, representing the sequential type, the deep-stacked generalization ensemble learning approach outperforms the others, with values of 0.91, 0.84, 0.86, 0.80, and 0.94 on the same five metrics. Overall, we conclude that applying DL models using our proposed approaches is a promising method for healthcare support systems to enhance prediction and diagnosis performance. Furthermore, our study reveals that these approaches are flexible and easy to apply to achieve optimal performance.
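For readers unfamiliar with the five metrics reported above, the following is a minimal Python sketch of how accuracy, recall, F1-score, Matthews correlation coefficient, and area under the ROC curve are commonly computed for a binary classifier. It is not the authors' evaluation code: the function names (`binary_metrics`, `roc_auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers it produces are unrelated to the results in the abstract.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, recall, F1, and MCC from binary labels and hard predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # MCC uses all four confusion-matrix cells; it is 0 when undefined.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "recall": recall, "f1": f1, "mcc": mcc}


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not taken from the paper's datasets).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, recall, F1, and MCC depend on the chosen decision threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds. This is one reason the two kinds of metric are usually reported together.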
Keyphrases
- deep learning
- big data
- healthcare
- machine learning
- convolutional neural network
- artificial intelligence
- data analysis
- electronic health record
- decision making
- neural network
- high resolution
- public health
- computed tomography
- health information
- case report
- physical activity
- pulmonary hypertension
- single cell
- room temperature