Explainable machine learning for knee osteoarthritis diagnosis based on a novel fuzzy feature selection methodology.
Christos Kokkotis, Charis Ntakolia, Serafeim Moustakidis, Giannis Giakas, Dimitrios Tsaopoulos
Published in: Physical and Engineering Sciences in Medicine (2022)
Knee osteoarthritis (KOA) is a degenerative joint disease of the knee that results from the progressive loss of cartilage. Because KOA is multifactorial and its pathophysiology is poorly understood, there is a need for reliable tools that reduce diagnostic errors made by clinicians. The existence of public databases has facilitated the advent of advanced analytics in KOA research; however, the heterogeneity of the available data, together with its high feature dimensionality, makes this diagnosis task difficult. The objective of the present study is to provide a robust feature selection (FS) methodology that can: (i) handle the multidimensional nature of the available datasets and (ii) mitigate the shortcomings of existing feature selection techniques in identifying the important risk factors that contribute to KOA diagnosis. To this end, we used multidimensional data obtained from the Osteoarthritis Initiative database for individuals with and without KOA. The proposed fuzzy ensemble feature selection methodology uses fuzzy logic to aggregate the results of several FS algorithms (filter, wrapper and embedded ones). The effectiveness of the proposed methodology was evaluated using an extensive experimental setup that involved multiple competing FS algorithms and several well-known machine learning (ML) models. The best-performing model (a Random Forest classifier) achieved a classification accuracy of 73.55% on a group of twenty-one selected risk factors. Finally, explainability analysis was performed to quantify the impact of the selected features on the model's output, thus enhancing our understanding of the rationale behind the decision-making mechanism of the best model.
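The abstract does not state the fuzzy rules used to combine the FS algorithms, so the following is only a minimal sketch of what a fuzzy aggregation of several selectors' scores could look like. It is not the authors' method. The triangular membership functions, the max (fuzzy OR) aggregation, the centroid defuzzification, the feature names and all scores are assumptions made for illustration.

```python
# Illustrative only: membership shapes, the max (fuzzy OR) aggregation and the
# centroid defuzzification are assumptions, not the published methodology.

def normalize(scores):
    """Min-max scale one selector's raw importance scores to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {f: (s - lo) / span if span else 0.0 for f, s in scores.items()}


def memberships(x):
    """Triangular memberships of x in [0, 1] to (low, medium, high) importance."""
    low = max(0.0, 1.0 - 2.0 * x)
    medium = max(0.0, 1.0 - abs(2.0 * x - 1.0))
    high = max(0.0, 2.0 * x - 1.0)
    return low, medium, high


def fuzzy_rank(selector_scores):
    """Fuse per-selector scores into one ranking, best feature first."""
    normalized = [normalize(s) for s in selector_scores.values()]
    features = list(normalized[0])  # assumes every selector scores the same features
    centers = (0.0, 0.5, 1.0)       # representative value of each fuzzy set
    fused = {}
    for f in features:
        per_selector = [memberships(n[f]) for n in normalized]
        # Fuzzy OR: keep the strongest support for each set across selectors.
        agg = [max(m[i] for m in per_selector) for i in range(3)]
        # Centroid defuzzification back to a single crisp score in [0, 1].
        fused[f] = sum(c * a for c, a in zip(centers, agg)) / sum(agg)
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores from one filter, one wrapper and one embedded selector.
selector_scores = {
    "filter": {"age": 0.2, "bmi": 0.8, "pain": 1.0, "noise": 0.0},
    "wrapper": {"age": 1, "bmi": 3, "pain": 5, "noise": 1},
    "embedded": {"age": 0.1, "bmi": 0.3, "pain": 0.5, "noise": 0.1},
}
ranked = fuzzy_rank(selector_scores)
selected = [name for name, _ in ranked[:2]]  # keep the two top-ranked features
print(ranked)
print(selected)
```

In a pipeline like the one described, the fused ranking would then be truncated to a chosen number of features (twenty-one in the reported result) before the classifiers are trained. The cut-off of two used here is arbitrary.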
Keyphrases
- machine learning
- knee osteoarthritis
- big data
- deep learning
- risk factors
- artificial intelligence
- neural network
- decision making
- electronic health record
- healthcare
- randomized controlled trial
- multiple sclerosis
- systematic review
- mental health
- climate change
- patient safety
- clinical trial
- adverse drug
- rheumatoid arthritis
- palliative care
- quality improvement
- extracellular matrix