Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence.
Hela Elmannai, Nora El-Rashidy, Ibrahim Mashal, Manal Abdullah Alohali, Sara F Abd-El Ghany, Shaker El-Sappagh, Hager Saleh
Published in: Diagnostics (Basel, Switzerland) (2023)
Polycystic ovary syndrome (PCOS) is a serious health problem that is common among women worldwide. Early detection and treatment of PCOS reduce the risk of long-term complications, such as type 2 diabetes and gestational diabetes. Effective, early PCOS diagnosis therefore helps healthcare systems reduce the burden of the disease and its complications. Machine learning (ML) and ensemble learning have recently shown promising results in medical diagnostics. The main goal of our research is to provide local and global model explanations that ensure efficiency, effectiveness, and trust in the developed model. Feature selection methods are combined with different types of ML models (logistic regression (LR), random forest (RF), decision tree (DT), naive Bayes (NB), support vector machine (SVM), k-nearest neighbor (KNN), XGBoost, and AdaBoost) to identify the optimal feature subset and the best model. Stacking ML models that combine the best base ML models with a meta-learner are proposed to improve performance. Bayesian optimization is used to optimize the ML models. SMOTE (Synthetic Minority Oversampling Technique) is combined with ENN (Edited Nearest Neighbour) to address the class imbalance. The experiments were conducted on a benchmark PCOS dataset with two split ratios, 70:30 and 80:20. The results showed that the stacking ML model with RFE (recursive feature elimination) feature selection recorded the highest accuracy, 100%, compared with the other models.
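The class-imbalance step named in the abstract, oversampling with SMOTE and then cleaning with ENN, can be illustrated with a small dependency-free sketch. This is not the authors' implementation: the function names (`knn_indices`, `smote`, `enn`), the choice of k = 3, the majority-vote editing rule, and the toy two-feature data are all assumptions made for illustration. In practice a library routine such as `SMOTEENN` from imbalanced-learn would normally be used instead.

```python
import math
import random


def knn_indices(points, query, k, exclude=None):
    """Indices of the k points nearest to `query` (Euclidean), skipping index `exclude`."""
    candidates = (i for i in range(len(points)) if i != exclude)
    return sorted(candidates, key=lambda i: math.dist(points[i], query))[:k]


def smote(X, y, minority, k=3, rng=None):
    """Add synthetic minority samples until both classes have equal size.

    Assumes a two-class problem and at least two minority samples.
    """
    rng = rng or random.Random(0)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_new = (len(X) - len(minority_pts)) - len(minority_pts)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_new, 0)):
        i = rng.randrange(len(minority_pts))
        j = rng.choice(knn_indices(minority_pts, minority_pts[i], k, exclude=i))
        gap = rng.random()
        # New point lies on the segment between a minority sample and one of its neighbours.
        synthetic = tuple(a + gap * (b - a) for a, b in zip(minority_pts[i], minority_pts[j]))
        X_out.append(synthetic)
        y_out.append(minority)
    return X_out, y_out


def enn(X, y, k=3):
    """Drop every sample whose label disagrees with the majority of its k nearest neighbours."""
    keep = []
    for i, (x, label) in enumerate(zip(X, y)):
        neighbours = knn_indices(X, x, k, exclude=i)
        agree = sum(1 for j in neighbours if y[j] == label)
        if agree * 2 > len(neighbours):
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy imbalanced data (hypothetical): 40 majority samples, 10 minority samples.
rng = random.Random(42)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
X += [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(10)]
y = [0] * 40 + [1] * 10

X_bal, y_bal = smote(X, y, minority=1, k=3, rng=rng)  # oversample the minority class
X_clean, y_clean = enn(X_bal, y_bal, k=3)             # then remove noisy or borderline samples
print(len(y), len(y_bal), len(y_clean))
```

Resampling of this kind is usually applied to the training split only, so that the held-out portion of a 70:30 or 80:20 split still reflects the original class distribution.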
Keyphrases
- polycystic ovary syndrome
- machine learning
- artificial intelligence
- insulin resistance
- deep learning
- healthcare
- type 2 diabetes
- big data
- risk factors
- neural network
- mental health
- systematic review
- adipose tissue
- crispr cas
- convolutional neural network
- health information
- risk assessment
- metabolic syndrome
- smoking cessation
- glycemic control
- cardiovascular disease
- combination therapy