GS-TCGA: Gene Set-Based Analysis of The Cancer Genome Atlas.
Tarrion Baird, Rahul Roychoudhuri
Published in: Journal of computational biology: a journal of computational molecular cell biology (2024)
Most tools for analyzing large gene expression datasets, including The Cancer Genome Atlas (TCGA), have focused on analyzing the expression of individual genes or on inferring the abundance of specific cell types from whole-transcriptome information. While these methods provide useful insights, they can overlook crucial process-based information that may enhance our understanding of cancer biology. In this study, we describe three novel tools incorporated into an online resource: gene set-based analysis of The Cancer Genome Atlas (GS-TCGA). GS-TCGA is designed to enable user-friendly exploration of TCGA data using gene set-based analysis, leveraging gene sets from the Molecular Signatures Database. The three tools are as follows. GS-Surv determines the association between the expression of gene sets and survival in human cancers. Co-correlative gene set enrichment analysis (CC-GSEA) utilizes interpatient heterogeneity in cancer gene expression to infer the functions of specific genes based on GSEA of coregulated genes in TCGA. GS-Corr utilizes interpatient heterogeneity in cancer gene expression profiles to identify genes coregulated with the expression of specific gene sets in TCGA. Users are also able to upload custom gene sets for analysis with each tool. These tools empower researchers to perform survival analysis linked to gene set expression, explore the functional implications of gene coexpression, and identify potential gene regulatory mechanisms.
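As a rough illustration of the idea behind GS-Corr (finding genes whose expression tracks a gene set across patients), the following Python sketch scores a gene set per patient and ranks all genes by their correlation with that score. It is not the authors' implementation: the abstract does not state how gene set expression is summarized or which correlation measure is used, so the mean z-score summary, the Pearson correlation, the function names, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def gene_set_score(expr, genes, gene_set):
    """Per-patient score: mean z-score of the set's genes (assumed summary)."""
    idx = [genes.index(g) for g in gene_set if g in genes]
    sub = expr[idx]  # rows are genes, columns are patients
    z = (sub - sub.mean(axis=1, keepdims=True)) / sub.std(axis=1, keepdims=True)
    return z.mean(axis=0)

def rank_genes_by_correlation(expr, genes, score):
    """Pearson correlation of each gene with the score, highest first."""
    centered = expr - expr.mean(axis=1, keepdims=True)
    s = score - score.mean()
    r = (centered @ s) / (np.linalg.norm(centered, axis=1) * np.linalg.norm(s))
    order = np.argsort(-r)
    return [(genes[i], float(r[i])) for i in order]

# Toy data (hypothetical): 5 genes measured in 50 patients.
rng = np.random.default_rng(0)
program = rng.normal(size=50)  # hidden patient-to-patient variation
genes = ["G1", "G2", "G3", "G4", "G5"]
expr = np.vstack([
    program + 0.1 * rng.normal(size=50),   # G1: member of the gene set
    program + 0.1 * rng.normal(size=50),   # G2: member of the gene set
    rng.normal(size=50),                   # G3: unrelated gene
    program + 0.1 * rng.normal(size=50),   # G4: coregulated with the set
    -program + 0.1 * rng.normal(size=50),  # G5: anti-correlated with the set
])

score = gene_set_score(expr, genes, ["G1", "G2"])
ranked = rank_genes_by_correlation(expr, genes, score)
for name, r in ranked:
    print(f"{name}\t{r:+.2f}")
```

In this toy setup, G4 is built to follow the same hidden variation as the gene set and G5 to oppose it, so they should fall near the top and bottom of the ranking respectively. A ranked list of this kind, computed against a single gene rather than a gene set score, is also the sort of input a GSEA-style analysis such as CC-GSEA would consume, though the abstract does not describe that pipeline in detail.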
Keyphrases
- genome wide
- genome wide identification
- copy number
- gene expression
- papillary thyroid
- dna methylation
- single cell
- poor prognosis
- genome wide analysis
- squamous cell carcinoma
- transcription factor
- healthcare
- emergency department
- rna seq
- endothelial cells
- stem cells
- long non coding rna
- lymph node metastasis
- electronic health record
- machine learning
- big data
- anaerobic digestion