scGMAAE: Gaussian mixture adversarial autoencoders for diversification analysis of scRNA-seq data.
Hai-Yun Wang, Jian-Ping Zhao, Chun-Hou Zheng, Yan-Sen Su
Published in: Briefings in Bioinformatics (2023)
Advances in single-cell RNA sequencing (scRNA-seq) have produced a large volume of scRNA-seq data, which are widely used in biomedical research. The noise in the raw data and the tens of thousands of genes measured make it challenging to capture the true structure and the useful information in scRNA-seq data. Most existing single-cell analysis methods assume that the low-dimensional embedding of the raw data follows a single Gaussian distribution or lies in a low-dimensional nonlinear space without any prior information, which greatly limits the flexibility and controllability of the model. In addition, many existing methods have a high computational cost, which makes them difficult to apply to large-scale datasets. Here, we design and develop a deep generative model named Gaussian mixture adversarial autoencoders (scGMAAE), which assumes that the low-dimensional embeddings of different cell types follow different Gaussian distributions and integrates Bayesian variational inference with adversarial training, so as to give an interpretable latent representation of complex data and discover the statistical distribution of different cell types. scGMAAE offers good controllability, interpretability and scalability. It can therefore process large-scale datasets in a short time and give competitive results. scGMAAE outperforms existing methods in several tasks, including dimensionality reduction and visualization, cell clustering, differential expression analysis and batch effect removal. Importantly, compared with most deep learning methods, scGMAAE requires fewer iterations to reach its best results.
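The central modeling assumption, that each cell type occupies its own Gaussian in the latent space, can be illustrated with a small sketch of a Gaussian mixture prior. In an adversarial autoencoder, draws from such a prior are what the discriminator compares against the encoder's outputs. The sketch below is not the authors' implementation: the function names `make_mixture_prior` and `sample_prior`, the number of components, the circular layout of the means, the shared isotropic variance, and the uniform mixture weights are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
import random


def make_mixture_prior(n_components, latent_dim=2, radius=4.0, sigma=0.5):
    """Build an isotropic Gaussian mixture prior (illustrative assumption).

    Component means are spaced evenly on a circle in the first two latent
    dimensions; any remaining dimensions have mean zero.
    """
    if latent_dim < 2:
        raise ValueError("latent_dim must be at least 2 for the circular layout")
    means = []
    for k in range(n_components):
        angle = 2.0 * math.pi * k / n_components
        mean = [0.0] * latent_dim
        mean[0] = radius * math.cos(angle)
        mean[1] = radius * math.sin(angle)
        means.append(mean)
    return means, sigma


def sample_prior(means, sigma, n_samples, rng):
    """Draw latent vectors and component labels, assuming uniform weights."""
    samples, labels = [], []
    for _ in range(n_samples):
        k = rng.randrange(len(means))                # pick a component (a "cell type")
        z = [rng.gauss(m, sigma) for m in means[k]]  # sample around that component's mean
        samples.append(z)
        labels.append(k)
    return samples, labels


rng = random.Random(0)  # fixed seed so the example is reproducible
means, sigma = make_mixture_prior(n_components=5)
z, labels = sample_prior(means, sigma, n_samples=1000, rng=rng)
```

Because each sample carries the index of the component it came from, a prior of this form gives a latent space in which clusters have a direct statistical interpretation. This is the kind of controllability the abstract refers to, although the actual parameterization used by scGMAAE may differ.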
Keyphrases
- single cell
- RNA-seq
- electronic health record
- big data
- high throughput
- deep learning
- genome wide
- induced apoptosis
- DNA methylation
- machine learning
- mesenchymal stem cells
- artificial intelligence
- cell cycle arrest
- cell death
- healthcare
- data analysis
- bone marrow
- health information
- optical coherence tomography
- social media
- convolutional neural network
- cell therapy
- anaerobic digestion