Hierarchical Boosting Dual-Stage Feature Reduction Ensemble Model for Parkinson's Disease Speech Data.

Mingyao Yang, Jie Ma, Pin Wang, Zhi Yong Huang, Yongming Li, He Liu, Zeeshan Hameed

Published in: Diagnostics (Basel, Switzerland) (2021)

Parkinson's disease (PD) is a neurodegenerative disease that is hard to identify at an early stage, and machine learning models built on speech data have proved effective for early diagnosis. However, speech data contain a high degree of redundancy, repetition, and noise, which reduces diagnostic accuracy. Feature reduction (FR) can alleviate this issue, but traditional FR methods are one-sided: feature extraction constructs high-quality features but expresses no preference among them, whereas feature selection expresses such a preference but cannot construct new high-quality features. To address this issue, this paper proposes the Hierarchical Boosting Dual-Stage Feature Reduction Ensemble Model (HBD-SFREM). Its major contributions are as follows: (1) An iterative deep extraction mechanism builds the deep hierarchy instance space. (2) A nearest neighbor feature preference method is embedded in the manifold feature extraction method to form a dual-stage feature reduction pair. (3) The dual-stage feature reduction pair is applied iteratively under the AdaBoost mechanism to obtain higher-quality instance features, which substantially improves recognition accuracy. (4) The deep hierarchy instance space is integrated with the original instance space to improve the generalization of the algorithm. HBD-SFREM is tested on three PD speech datasets and a self-collected dataset. Compared with other FR algorithms and deep learning algorithms, HBD-SFREM achieves significantly higher accuracy in PD speech recognition and is not adversely affected by small sample sizes. HBD-SFREM could therefore serve as a reference for related studies.
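To make the idea of a dual-stage feature reduction pair concrete, the following is a minimal sketch, not the authors' implementation. It chains a feature extraction step with a nearest neighbor feature ranking step. Several things here are assumptions made for illustration: principal directions found by power iteration stand in for the manifold feature extraction method, a basic Relief-style score (without feature scaling) stands in for the nearest neighbor feature preference method, and all function names and the toy data are hypothetical. The AdaBoost iteration and the deep hierarchy instance space are not modeled.

```python
import math
import random


def center(X):
    """Subtract each feature's mean from every row."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_directions(X, n_components, n_iter=100, seed=0):
    """Stage 1 stand-in (assumption): principal directions by power iteration."""
    rng = random.Random(seed)
    Xc = center(X)
    d = len(Xc[0])
    found = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            # Apply the unnormalised covariance: v <- Xc^T (Xc v).
            proj = [dot(row, v) for row in Xc]
            v = [sum(p * row[j] for p, row in zip(proj, Xc)) for j in range(d)]
            # Deflate: drop any component along directions already found.
            for u in found:
                c = dot(v, u)
                v = [a - c * b for a, b in zip(v, u)]
            norm = math.sqrt(dot(v, v)) or 1.0
            v = [a / norm for a in v]
        found.append(v)
    return found


def extract(X, directions):
    """Project the centered data onto the given directions."""
    return [[dot(row, u) for u in directions] for row in center(X)]


def relief_scores(X, y):
    """Stage 2 stand-in (assumption): Relief-style nearest neighbor scores."""
    n, d = len(X), len(X[0])
    scores = [0.0] * d
    for i in range(n):
        hit = miss = None
        hit_dist = miss_dist = math.inf
        for k in range(n):
            if k == i:
                continue
            dist = sum((a - b) ** 2 for a, b in zip(X[i], X[k]))
            if y[k] == y[i] and dist < hit_dist:
                hit, hit_dist = k, dist
            elif y[k] != y[i] and dist < miss_dist:
                miss, miss_dist = k, dist
        if hit is None or miss is None:
            continue
        # Reward features that differ from the nearest other-class instance
        # and agree with the nearest same-class instance.
        for j in range(d):
            scores[j] += abs(X[i][j] - X[miss][j]) - abs(X[i][j] - X[hit][j])
    return scores


def dual_stage_reduce(X, y, n_extract, n_keep):
    """Extract n_extract features, then keep the n_keep best-scoring ones."""
    Z = extract(X, top_directions(X, n_extract))
    scores = relief_scores(Z, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:n_keep])
    return [[row[j] for j in keep] for row in Z], keep


def make_toy_data(n_per_class=10, seed=1):
    """Hypothetical data: two classes separated along the first feature."""
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([mean + rng.gauss(0, 0.3), rng.gauss(0, 0.3), rng.gauss(0, 0.3)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    Z, keep = dual_stage_reduce(X, y, n_extract=2, n_keep=1)
    print("kept extracted features:", keep)
    print("reduced shape:", len(Z), "x", len(Z[0]))
```

The point of pairing the two stages is that the ranking step operates on the constructed features rather than the raw ones, so extraction supplies feature quality and selection supplies feature preference. In a fuller treatment, the pair would presumably be re-run with boosting-style instance weights, which this sketch omits.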

Keyphrases