Improving the performance of single-cell RNA-seq data mining based on relative expression orderings.
Yuanyuan Chen, Hao Zhang, Xiao Sun
Published in: Briefings in Bioinformatics (2022)
The advent of single-cell RNA sequencing (scRNA-seq) provides an unprecedented opportunity to explore gene expression profiles at the single-cell level. However, gene expression values vary over time and across conditions, even within the same cell, so more stable and reliable features are urgently needed to describe cell heterogeneity. We therefore construct a new feature matrix, the delta rank matrix (DRM), from scRNA-seq data by integrating an a priori gene interaction network. The DRM transforms unreliable gene expression values into stable gene interaction (edge) values for each single cell. This is the first time that gene-level features have been transformed into interaction/edge-level features for scRNA-seq data analysis based on relative expression orderings. Experiments on various scRNA-seq datasets demonstrate that the DRM performs better than the original gene expression matrix in cell clustering, cell identification and pseudo-trajectory reconstruction. More importantly, the DRM fuses gene expression with gene interactions and provides a way to measure gene interactions at the single-cell level. The DRM can thus be used to find changes in gene interactions among different cell types, which may open up a new way to analyze scRNA-seq data from an interaction perspective. In addition, the DRM provides a new method for constructing a cell-specific network for each single cell, rather than for a group of cells as in traditional network construction methods. The DRM's strong performance stems from its extraction of rich gene-association information about biological systems and its stable characterization of cells.
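The abstract does not give the DRM formula, so the following is only a minimal sketch of one plausible reading: rank the genes within each cell, then assign every edge of the prior network the difference between the ranks of its two genes in that cell. The function name `delta_rank_matrix`, the genes-by-cells layout, the ordinal tie-breaking and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def delta_rank_matrix(expr, edges):
    """Hypothetical sketch of an edge-level feature matrix.

    expr  : array of shape (n_genes, n_cells) with expression values.
    edges : sequence of (i, j) gene-index pairs from a prior network.
    Returns an integer array of shape (n_edges, n_cells) holding
    rank(gene i) - rank(gene j) within each cell.
    """
    expr = np.asarray(expr, dtype=float)
    # Ordinal rank of every gene within each cell (0 = lowest expression).
    # Applying argsort twice converts sort order into ranks; ties are
    # broken by gene order, which is an assumption of this sketch.
    order = np.argsort(expr, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    edges = np.asarray(edges, dtype=int)
    # One row per edge: rank difference between the two endpoint genes.
    return ranks[edges[:, 0], :] - ranks[edges[:, 1], :]


# Toy data (made up): 4 genes x 3 cells and a 2-edge prior network.
expr = np.array([
    [5.0, 1.0, 0.0],
    [3.0, 2.0, 9.0],
    [0.0, 8.0, 4.0],
    [7.0, 4.0, 1.0],
])
edges = [(0, 1), (2, 3)]
drm = delta_rank_matrix(expr, edges)
print(drm)
```

Because only the within-cell ordering of genes enters the calculation, any strictly increasing transformation of a cell's expression values (for example a log transform or a per-cell scaling factor) leaves the result unchanged, which illustrates the kind of stability that relative expression orderings are meant to provide.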
Keyphrases
- single cell
- RNA-seq
- high throughput
- gene expression
- genome wide
- copy number
- genome wide identification
- data analysis
- DNA methylation
- machine learning
- poor prognosis
- induced apoptosis
- healthcare
- deep learning
- cell proliferation
- long non-coding RNA
- big data
- artificial intelligence
- cell cycle arrest
- genome wide analysis
- mesenchymal stem cells
- PI3K/Akt