
Integrated network analysis and machine learning approach for the identification of key genes of triple-negative breast cancer.

Leimarembi Devi Naorem, Mathavan Muthaiyan, Amouda Venkatesan
Published in: Journal of cellular biochemistry (2018)
Triple-negative breast cancer (TNBC) has attracted more attention than other breast cancer subtypes because of its aggressive nature and poor prognosis, and because chemotherapy remains the mainstay of treatment, with no approved targeted therapy. This study therefore aimed to discover more promising therapeutic targets and to gain new insight into the biological mechanisms of TNBC. Six microarray data sets comprising 463 non-TNBC and 405 TNBC samples were retrieved from the Gene Expression Omnibus. The data sets were integrated by meta-analysis, which identified 1075 differentially expressed genes. A protein-protein interaction network of 486 nodes and 1932 edges was constructed, from which 29 hub genes with high topological measures were obtained. Further, 16 features (hub genes), 12 upregulated (AURKB, CCNB2, CDC20, DDX18, EGFR, ENO1, MYC, NUP88, PLK1, PML, POLR2F, and SKP2) and four downregulated (CCND1, GLI3, SKP1, and TGFB3), were selected on the training data set using the machine learning correlation-based feature selection method. A naïve Bayes classifier built on the expression profiles of these 16 features accurately and reliably distinguished TNBC from non-TNBC samples in the validation test data set, with an area under the receiver operating characteristic curve of 0.93 to 0.98. Subsequently, Gene Ontology analysis revealed that the hub genes were enriched in mitotic cell cycle processes, and Kyoto Encyclopedia of Genes and Genomes pathway analysis showed that they were enriched in cell cycle pathways. Thus, the key hub genes and pathways identified in this study should enhance understanding of the molecular mechanisms of TNBC, and the hub genes may serve as potential therapeutic targets.
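The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the expression values are synthetic, the Gaussian form of the naïve Bayes likelihood is an assumption (the abstract does not state the variant), and the helper names (`fit_gaussian_nb`, `score`, `roc_auc`, `make_samples`) are hypothetical. It only shows how a 16-feature naïve Bayes classifier can be trained on one set of samples and scored by the area under the ROC curve on a held-out set.

```python
import math
import random

N_UP, N_DOWN = 12, 4  # 12 upregulated and 4 downregulated features, as in the abstract


def make_samples(rng, n, label):
    """Synthetic expression profiles (assumption: unit-variance Gaussian noise)."""
    rows = []
    for _ in range(n):
        # TNBC samples (label 1) are shifted up in the first 12 genes, down in the last 4.
        shifts = [1.0] * N_UP + [-1.0] * N_DOWN if label == 1 else [0.0] * (N_UP + N_DOWN)
        rows.append([rng.gauss(s, 1.0) for s in shifts])
    return rows, [label] * n


def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9  # small floor avoids division by zero
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (math.log(n / len(y)), means, variances)
    return model


def log_joint(model, x, label):
    """Log prior plus the sum of per-feature Gaussian log likelihoods."""
    log_prior, means, variances = model[label]
    total = log_prior
    for v, m, var in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
    return total


def score(model, x):
    """Log-odds of class 1 (TNBC) versus class 0 (non-TNBC)."""
    return log_joint(model, x, 1) - log_joint(model, x, 0)


def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
train_pos, y_tp = make_samples(rng, 100, 1)
train_neg, y_tn = make_samples(rng, 100, 0)
test_pos, y_sp = make_samples(rng, 50, 1)
test_neg, y_sn = make_samples(rng, 50, 0)

model = fit_gaussian_nb(train_pos + train_neg, y_tp + y_tn)
test_scores = [score(model, x) for x in test_pos + test_neg]
auc = roc_auc(test_scores, y_sp + y_sn)
print(f"held-out AUC on synthetic data: {auc:.3f}")
```

Because the synthetic classes are well separated by construction, the AUC printed here says nothing about the reported 0.93 to 0.98; with real data, the features would be the expression values of the selected hub genes and the held-out set an independent data set.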
Keyphrases