An NMF-Based Methodology for Selecting Biomarkers in the Landscape of Genes of Heterogeneous Cancer-Associated Fibroblast Populations.
Flavia Esposito, Angelina Boccarelli, Nicoletta Del Buono. Published in: Bioinformatics and Biology Insights (2020)
The rapid development of high-performance technologies has greatly advanced studies in molecular oncology, producing large amounts of data. Even when these data are publicly available, they must be processed and studied to extract information that helps explain the mechanisms of pathogenesis of complex diseases such as tumors. In this article, we illustrate a procedure for mining biologically meaningful biomarkers from microarray datasets of different tumor histotypes. The proposed methodology automatically identifies a subset of potentially informative genes from microarray data matrices that may differ in both the number of rows (genes) and the number of columns (patients). It combines nonnegative matrix factorization (NMF), a web tool for functional enrichment analysis, and a purpose-built gene extraction procedure, so that omics input data with different numbers of rows can be analyzed. The methodology has been used to mine microarray data from solid tumors of different embryonic origin, to verify the presence of common genes characterizing the heterogeneity of cancer-associated fibroblasts. These automatically extracted biomarkers could be used to suggest appropriate therapies that inactivate activated fibroblasts, thus preventing their contribution to tumor progression.
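To make the NMF step concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not say which NMF algorithm or gene extraction rule was used, so the code assumes the standard multiplicative updates for the Frobenius loss and a hypothetical "top k genes per factor" rule. The function names `nmf_multiplicative` and `top_genes_per_metagene`, the rank, and the random toy matrix are all illustrative, and the functional enrichment step is omitted.

```python
import numpy as np


def nmf_multiplicative(X, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix X (genes x patients) as W @ H.

    Uses multiplicative updates for the Frobenius loss; this choice of
    algorithm is an assumption, not taken from the article.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_patients = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_patients)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay >= 0.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def top_genes_per_metagene(W, gene_names, k=3):
    """Hypothetical extraction rule: for each column of W (a "metagene"),
    keep the k genes with the largest weights."""
    selected = {}
    for j in range(W.shape[1]):
        order = np.argsort(W[:, j])[::-1][:k]
        selected[j] = [gene_names[i] for i in order]
    return selected


# Toy stand-in for a microarray matrix: 20 genes (rows) x 8 patients (columns).
rng = np.random.default_rng(42)
gene_names = [f"gene_{i}" for i in range(20)]
X = rng.random((20, 8))

W, H = nmf_multiplicative(X, rank=2)
selected = top_genes_per_metagene(W, gene_names, k=3)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

for metagene, genes in selected.items():
    print(f"metagene {metagene}: {genes}")
print(f"relative reconstruction error: {rel_error:.3f}")
```

Because the selection works on gene names rather than row positions, the same rule can be applied to matrices with different numbers of rows, and the resulting gene lists can then be compared across datasets.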
Keyphrases
- bioinformatics analysis
- genome wide
- electronic health record
- genome wide identification
- big data
- single cell
- end stage renal disease
- newly diagnosed
- ejection fraction
- data analysis
- oxidative stress
- prognostic factors
- palliative care
- transcription factor
- minimally invasive
- healthcare
- poor prognosis
- machine learning
- peritoneal dialysis
- patient reported outcomes
- long non coding rna
- artificial intelligence
- liquid chromatography