Neurogenerative Disease Diagnosis in Cepstral Domain Using MFCC with Deep Learning.

Norah Saleh Alghamdi, Mohammed Zakariah, Vinh Truong Hoang, Mohammad Mamun Elahi

Published in: Computational and mathematical methods in medicine (2022)

Because speech signals are regulated by underlying cognitive and neuromuscular activity, biomarkers in the human voice can provide insight into neurological illnesses. Neurologic voice disorders have multiple motor and nonmotor aspects that arise from an underlying neurologic condition such as Parkinson's disease, multiple sclerosis, myasthenia gravis, or amyotrophic lateral sclerosis (ALS). Voice problems can be caused by disorders that affect the corticospinal system, cerebellum, basal ganglia, and upper or lower motoneurons. Recent work indicates that voice pathology detection technologies can aid in the assessment of voice irregularities and enable the early diagnosis of voice pathology. In this paper, we present two deep-learning-based computational models, a 1-dimensional convolutional neural network (1D CNN) and a 2-dimensional convolutional neural network (2D CNN), that detect voice pathologies caused by neurological illnesses as well as by other causes. We used voice recordings of the sustained vowel /a/ produced at normal pitch, taken from the German corpus Saarbruecken Voice Database (SVD). The collected voice signals are padded and segmented to keep the inputs uniform and to increase the number of samples. Convolutional layers are applied to the raw data, and Mel-frequency cepstral coefficient (MFCC) features are also extracted. The 1D CNN achieved the highest accuracy on test data (93.11%), but its training showed overfitting. The 2D CNN generalized better and had lower training and validation loss, despite a lower test accuracy of 84.17%. The 2D CNN also outperforms state-of-the-art studies in the field, which suggests that a model trained on handcrafted features is better suited to speech processing than a model that extracts features directly from the signal.
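The padding and segmentation step mentioned above can be illustrated with a minimal sketch. The abstract does not give the segment length, the overlap, or the padding scheme, so the function name `pad_and_segment`, the parameters `segment_len` and `hop`, and the choice of trailing zero-padding are all assumptions made here for illustration, not the authors' implementation.

```python
import numpy as np


def pad_and_segment(signal, segment_len, hop=None):
    """Zero-pad a 1-D signal and split it into equal-length segments.

    segment_len and hop are measured in samples. Both values, and the
    use of trailing zero-padding, are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=np.float32)
    hop = segment_len if hop is None else hop

    # Number of segments needed so that every sample is covered.
    if signal.size <= segment_len:
        n_segments = 1
    else:
        n_segments = 1 + int(np.ceil((signal.size - segment_len) / hop))

    # Pad with zeros at the end so the last segment is complete.
    padded_len = (n_segments - 1) * hop + segment_len
    padded = np.pad(signal, (0, padded_len - signal.size))

    # Stack the segments into an array of shape (n_segments, segment_len).
    starts = range(0, n_segments * hop, hop)
    return np.stack([padded[s:s + segment_len] for s in starts])


# Hypothetical usage: a 10-sample signal cut into 4-sample segments.
segments = pad_and_segment(np.arange(10), segment_len=4)
print(segments.shape)
```

Each row of the result has the same length, so the segments can be batched directly as raw input to a 1D model or passed to an MFCC extractor to build 2D inputs. Setting `hop` smaller than `segment_len` yields overlapping segments, which increases the sample count further.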

Keyphrases