Improving gene set enrichment analysis (GSEA) by using regulation directionality.
Biwen Wang, Frans M van der Kloet, Mariah B M J Kes, Joen Luirink, Leendert W Hamoen
Published in: Microbiology spectrum (2024)
To infer the biological meaning from transcriptome data, it is useful to focus on genes that are regulated by the same regulator, i.e., regulons. Unfortunately, current gene set enrichment analysis (GSEA) tools do not consider whether a gene is activated or repressed by a regulator. This distinction is crucial when analyzing regulons since a regulator can work as an activator of certain genes and as a repressor of other genes, yet both sets of genes belong to the same regulon. Therefore, simply averaging expression differences of the genes of such a regulon will not properly reflect the activity of the regulator. What makes it more complicated is the fact that many genes are regulated by different transcription factors, and current transcriptome analysis tools are unable to indicate which regulator is most likely responsible for the observed expression difference of a gene. To address these challenges, we developed the gene set enrichment analysis program GINtool. Additional features of GINtool are novel graphical representations to facilitate the visualization of gene set analyses of transcriptome data, the possibility to include functional categories as gene sets for analysis, and the option to analyze expression differences within operons, which is useful when analyzing prokaryotic transcriptome and also proteome data.
IMPORTANCE
Measuring the activity of all genes in cells is a common way to elucidate the function and regulation of genes. These transcriptome analyses produce large amounts of data since genomes contain thousands of genes. The analysis of these large data sets is challenging. Therefore, we developed a new software tool called GINtool that can facilitate the analysis of transcriptome data by using prior knowledge of gene sets controlled by the same regulator, the so-called regulons. An important novelty of GINtool is that it can take into account the directionality of gene regulation in these analyses, i.e., whether a gene is activated or repressed, which is crucial to assess whether a regulon or functional category is affected. GINtool also includes new graphical methods to facilitate the visual inspection of regulation events in transcriptome data sets. These and additional analysis methods included in GINtool make it a powerful software tool to analyze transcriptome data.
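The point about directionality can be illustrated with a small sketch. It is not GINtool's actual algorithm, which the abstract does not describe; it only shows why a plain average of expression differences can hide the activity of a regulator that activates some genes and represses others. The gene names, the log2 fold changes, the +1/-1 sign convention, and both function names are assumptions made up for this illustration.

```python
from statistics import mean

# Hypothetical log2 fold changes (treatment vs. control) for four genes.
log2_fc = {"geneA": 2.0, "geneB": 1.5, "geneC": -1.8, "geneD": -2.2}

# Hypothetical regulon annotation: +1 if the regulator activates the gene,
# -1 if it represses the gene.
regulon = {"geneA": +1, "geneB": +1, "geneC": -1, "geneD": -1}


def naive_score(fold_changes, regulon_signs):
    """Average fold change of the regulon genes, ignoring directionality."""
    return mean(fold_changes[g] for g in regulon_signs if g in fold_changes)


def directional_score(fold_changes, regulon_signs):
    """Average fold change after flipping the sign for repressed genes,
    so that a consistently more active regulator gives a positive score."""
    return mean(
        sign * fold_changes[g]
        for g, sign in regulon_signs.items()
        if g in fold_changes
    )


if __name__ == "__main__":
    print("naive:", naive_score(log2_fc, regulon))
    print("directional:", directional_score(log2_fc, regulon))
```

In this made-up data set the activated genes go up and the repressed genes go down, which is what a more active regulator would produce. The naive average is close to zero because the two groups cancel, whereas the sign-corrected average is clearly positive.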
Keyphrases
- genome wide
- genome wide identification
- transcription factor
- dna methylation
- copy number
- electronic health record
- genome wide analysis
- big data
- single cell
- rna seq
- gene expression
- healthcare
- poor prognosis
- data analysis
- machine learning
- signaling pathway
- palliative care
- oxidative stress
- long non coding rna
- artificial intelligence