Screening Nonlinear miRNA Features of Breast Cancer by Using Ensemble Regularized Polynomial Logistic Regression.

Juntao Li, Shan Xiang, Xuekun Song
Published in: Journal of computational biology : a journal of computational molecular cell biology (2024)
Differentiating breast cancer subtypes based on miRNA data helps doctors provide more personalized treatment plans for patients. This paper explored the interaction between miRNA pairs and developed a novel ensemble regularized polynomial logistic regression method for screening nonlinear features of breast cancer. Three different types of second-order polynomial logistic regression with elastic net penalty (SOPLR-EN), each type containing 10 identical models, were integrated using a bootstrap sampling strategy to determine the most suitable sample set for feature screening. A single feature and 39 nonlinear features were obtained by keeping the features that appeared at least 15 times across the 30 integrated models and were involved in the classification of at least 4 subtypes. The second-order polynomial logistic regression with ridge penalty (SOPLR-R) built on the screened feature set achieved 82.30% classification accuracy in distinguishing breast cancer subtypes, surpassing the performance of six other methods. Further, 11 nonlinear miRNA biomarkers were identified, and their significant relevance to breast cancer was illustrated through six types of biological analysis.
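The screening rule described in the abstract (keep a feature if it appears at least 15 times across 30 models and is involved in at least 4 subtypes) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `second_order_terms` and `screen_features`, the miRNA labels, and the toy selections are hypothetical. The sketch also assumes that second-order terms are pairwise products only, and that "involved in at least 4 subtypes" means the union of subtypes with a nonzero coefficient across all models; the abstract does not specify either point.

```python
from collections import Counter, defaultdict
from itertools import combinations


def second_order_terms(names):
    """Return single features plus all pairwise interaction terms.

    Assumption: squared terms are omitted; only products of distinct
    miRNA pairs are treated as nonlinear features.
    """
    singles = [(n,) for n in names]
    pairs = list(combinations(names, 2))
    return singles + pairs


def screen_features(selections, min_count=15, min_subtypes=4):
    """Keep terms selected often enough and across enough subtypes.

    selections: one dict per fitted model, mapping a term (tuple of
    feature names) to the set of subtypes whose coefficient for that
    term is nonzero in that model.
    """
    counts = Counter()
    subtypes = defaultdict(set)
    for model_selection in selections:
        for term, classes in model_selection.items():
            if classes:  # term is active for at least one subtype
                counts[term] += 1
                subtypes[term] |= set(classes)
    return sorted(
        term
        for term in counts
        if counts[term] >= min_count and len(subtypes[term]) >= min_subtypes
    )


# Hypothetical toy input: 30 models over three placeholder miRNAs.
terms = second_order_terms(["miR-a", "miR-b", "miR-c"])

selections = []
for i in range(30):
    # Selected by every model, but only relevant to 2 subtypes.
    selection = {("miR-a",): {0, 1}}
    if i < 20:
        # Selected by 20 models and relevant to 4 subtypes.
        selection[("miR-a", "miR-b")] = {0, 1, 2, 3}
    if i < 10:
        # Relevant to 5 subtypes, but selected by only 10 models.
        selection[("miR-c",)] = {0, 1, 2, 3, 4}
    selections.append(selection)

kept = screen_features(selections)
print(kept)  # [('miR-a', 'miR-b')]
```

In this toy input only the interaction term passes both thresholds: the single feature `miR-a` is selected by every model but touches too few subtypes, and `miR-c` touches enough subtypes but is selected too rarely.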
Keyphrases
  • machine learning
  • deep learning
  • ejection fraction
  • newly diagnosed
  • computed tomography
  • convolutional neural network
  • young adults
  • prognostic factors
  • electronic health record
  • breast cancer risk