Prediction of a Multi-Gene Assay (Oncotype DX and Mammaprint) Recurrence Risk Group Using Machine Learning in Estrogen Receptor-Positive, HER2-Negative Breast Cancer: The BRAIN Study.
Jung Hwan Ji, Sung-Gwe Ahn, Youngbum Yoo, Shin-Young Park, Joo-Heung Kim, Ji-Yeong Jeong, Seho Park, Ilkyun Lee
Published in: Cancers (2024)
This study aimed to develop a machine learning-based model for predicting multi-gene assay (MGA) risk categories. Data from patients with estrogen receptor-positive (ER+)/HER2-negative (HER2-) breast cancer who had undergone Oncotype DX (ODX) or MammaPrint (MMP) testing were used to develop the model. The development cohort comprised 2565 patients: 2039 tested with ODX and 526 tested with MMP. The MMP risk prediction model used a single XGBoost model, whereas the ODX risk prediction model combined LightGBM, CatBoost, and XGBoost models through soft voting. The ensemble (MMP + ODX) model, which covers both assays, combined CatBoost and XGBoost through soft voting. Ten random samples, each corresponding to 10% of the modeling dataset, were extracted, and cross-validation was performed to evaluate accuracy on each validation set. Accuracy was 84.8% for the MMP model, 87.9% for the ODX model, and 86.8% for the ensemble model. In the ensemble cohort, the sensitivity, specificity, and precision for predicting the low-risk category were 0.91, 0.66, and 0.92, respectively. Prediction accuracy exceeded 90% in several subgroups and was highest, at 95.7%, in premenopausal patients with Ki-67 <20 and histologic grade (HG) 1 to 2. Our machine learning-based predictive model has the potential to complement existing MGAs in ER+/HER2- breast cancer.
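The soft-voting step and the low-risk metrics reported in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state model weights or decision thresholds, so equal weights and an argmax decision are assumed here. The function names (`soft_vote`, `predict_labels`, `binary_metrics`) are hypothetical, and the probabilities and labels are made-up toy values standing in for the outputs of fitted classifiers such as XGBoost, LightGBM, or CatBoost.

```python
from typing import Dict, List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Average per-class probabilities across models (equal weights assumed)."""
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    return [
        [sum(model[i][c] for model in prob_sets) / n_models for c in range(n_classes)]
        for i in range(n_samples)
    ]


def predict_labels(avg_probs: Sequence[Sequence[float]]) -> List[int]:
    """Pick the class with the highest averaged probability for each sample."""
    return [max(range(len(p)), key=p.__getitem__) for p in avg_probs]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 0) -> Dict[str, float]:
    """Sensitivity, specificity, precision, and accuracy for one 'positive' class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)

    def ratio(num: int, den: int) -> float:
        # Return NaN rather than raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, len(y_true)),
    }


# Toy probabilities from three hypothetical models for four patients.
# Column 0 = low risk, column 1 = high risk (illustrative values only).
model_a = [[0.9, 0.1], [0.4, 0.6], [0.7, 0.3], [0.2, 0.8]]
model_b = [[0.8, 0.2], [0.6, 0.4], [0.6, 0.4], [0.3, 0.7]]
model_c = [[0.7, 0.3], [0.3, 0.7], [0.8, 0.2], [0.1, 0.9]]

avg = soft_vote([model_a, model_b, model_c])
labels = predict_labels(avg)

# Made-up reference labels; low risk (0) is treated as the positive class,
# mirroring how the abstract reports metrics for the low-risk category.
y_true = [0, 0, 0, 1]
metrics = binary_metrics(y_true, labels, positive=0)
print(labels, metrics)
```

Soft voting averages probabilities rather than counting hard votes, so a model that is confident about a sample can outweigh two models that are only marginally on the other side, as happens for the second toy patient.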
Keyphrases
- estrogen receptor
- machine learning
- newly diagnosed
- clinical trial
- prognostic factors
- cell migration
- gene expression
- high throughput
- copy number
- squamous cell carcinoma
- big data
- lymph node
- single cell
- patient reported
- locally advanced