Development of Gene Expression-Based Random Forest Model for Predicting Neoadjuvant Chemotherapy Response in Triple-Negative Breast Cancer.
Seongyong Park, Gwan-Su Yi
Published in: Cancers (2022)
Neoadjuvant chemotherapy (NAC) response is an important indicator of patient survival in triple-negative breast cancer (TNBC), but predicting chemosensitivity remains a challenge in clinical practice. We developed an 86-gene random forest (RF) classifier that predicts NAC response, either pathological complete response (pCR) or residual disease (RD), in TNBC patients. pCR classification performance was evaluated with receiver operating characteristic (ROC) and precision-recall (PR) curves. On the test set, the model achieved an AUROC of 0.891 and an AUPRC of 0.829. At a predefined specificity (>90%), the model showed higher sensitivity than the best-performing previously reported NAC response prediction model (69.2% vs. 36.9%). Moreover, the pCR status predicted by the model explained the distant recurrence-free survival (DRFS) of TNBC patients well. In addition, the pCR probabilities computed by the model from the expression profiles of the CCLE TNBC cell lines showed a high Spearman rank correlation with cyclophosphamide sensitivity in those cell lines (SRCC = 0.697, p-value = 0.031). Functional enrichment analysis associated the 86 genes with DNA repair and cell cycle mechanisms. Our study suggests that the random forest-based prediction model provides a reliable prediction of the clinical response to neoadjuvant chemotherapy and may explain chemosensitivity in TNBC.
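The fixed-specificity comparison reported in the abstract (sensitivity at a specificity above 90%) can be illustrated with a minimal sketch. The function name `sensitivity_at_specificity` and the toy labels and scores below are hypothetical and are not taken from the study; the authors' actual evaluation procedure may differ.

```python
def sensitivity_at_specificity(labels, scores, min_specificity=0.90):
    """Return (sensitivity, threshold) for the cutoff with the highest
    sensitivity among cutoffs whose specificity exceeds min_specificity.

    labels: 1 = pCR (positive class), 0 = RD (negative class)
    scores: predicted pCR probabilities, one per sample
    A sample is called pCR when its score is >= the threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one pCR and one RD case")

    best = (0.0, None)
    # Try every observed score as a candidate cutoff.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
        sensitivity = tp / positives
        specificity = tn / negatives
        if specificity > min_specificity and sensitivity > best[0]:
            best = (sensitivity, threshold)
    return best


# Toy data (assumed for illustration only): 4 pCR cases and 10 RD cases.
labels = [1, 1, 1, 1] + [0] * 10
scores = [0.95, 0.90, 0.85, 0.42,
          0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80]

sens, cutoff = sensitivity_at_specificity(labels, scores)
print(f"sensitivity={sens:.2f} at cutoff={cutoff:.2f}")
```

Fixing specificity first and then comparing sensitivity puts two classifiers on the same false-positive budget, which is why a single pair of numbers such as 69.2% vs. 36.9% is directly comparable.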
Keyphrases
- rectal cancer
- locally advanced
- neoadjuvant chemotherapy
- sentinel lymph node
- gene expression
- cell cycle
- dna repair
- ejection fraction
- lymph node
- climate change
- end stage renal disease
- clinical practice
- machine learning
- dna damage
- squamous cell carcinoma
- radiation therapy
- genome wide
- high dose
- copy number
- real time pcr
- structural basis