Pima Indians diabetes mellitus classification based on machine learning (ML) algorithms.
Victor Chang, Jozeene Bailey, Qianwen Ariel Xu, Zhili Sun
Published in: Neural computing & applications (2022)
This paper proposes an e-diagnosis system based on machine learning (ML) algorithms, to be implemented in an Internet of Medical Things (IoMT) environment, particularly for diagnosing diabetes mellitus (type 2 diabetes). ML applications, however, tend to be mistrusted because they cannot show their internal decision-making process, which has slowed uptake by end-users in certain healthcare sectors. This research describes the use of three interpretable supervised ML models, a Naïve Bayes classifier, a random forest classifier, and a J48 decision tree, trained and tested on the Pima Indians diabetes dataset in the R programming language. The performance of each algorithm is analyzed to determine which has the best accuracy, precision, sensitivity, and specificity. The decision process of each model is also assessed in order to improve the model. The study concludes that a Naïve Bayes model works well with a more fine-tuned selection of features for binary classification, while random forest works better with more features.
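As an illustration of the four evaluation criteria named in the abstract, the sketch below computes accuracy, precision, sensitivity, and specificity from the confusion-matrix counts of a binary classifier. It is a minimal sketch under stated assumptions, not the authors' code: the paper's experiments were run in R, Python is used here only for illustration, and the names `ConfusionCounts`, `confusion_counts`, and `evaluate`, as well as the toy labels, are hypothetical and are not taken from the paper or from the Pima Indians dataset.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> ConfusionCounts:
    """Tally the confusion matrix, treating `positive` as the diabetic class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the four metrics listed in the abstract."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate (recall)
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
    }


# Toy labels for illustration only (1 = diabetic, 0 = non-diabetic).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

counts = confusion_counts(y_true, y_pred)
metrics = evaluate(counts)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters for a screening task like this one, because a model can reach high accuracy on an imbalanced dataset while still missing many positive cases.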