Beaconet: A Reference-Free Method for Integrating Multiple Batches of Single-Cell Transcriptomic Data in Original Molecular Space.
Han Xu, Yusen Ye, Ran Duan, Yong Gao, Yuxuan Hu, Lin Gao. Published in: Advanced science (Weinheim, Baden-Wurttemberg, Germany) (2024)
Integrating multiple single-cell datasets is essential for a comprehensive understanding of cell heterogeneity. Batch effects are undesired systematic variations among technologies or experimental laboratories that distort biological signals and hinder the integration of single-cell datasets. However, existing methods typically either rely on a selected dataset as a reference, which leads to inconsistent integration performance across different references, or embed cells into an uninterpretable low-dimensional feature space. To overcome these limitations, a reference-free method, Beaconet, is presented. It integrates multiple single-cell transcriptomic datasets in the original molecular space by aligning the global distribution of each batch using an adversarial correction network. Through extensive comparisons with 13 state-of-the-art methods, it is demonstrated that Beaconet can effectively remove batch effects while preserving biological variation, and that it outperforms existing unsupervised methods in overall performance whichever dataset those methods use as a reference. Furthermore, because Beaconet performs integration in the original molecular feature space, cell types can be characterized and downstream differential expression analysis can be run directly on the integrated data with gene-expression features. Additionally, when applied to large-scale atlas data integration, Beaconet shows notable advantages in both time and space efficiency. In summary, Beaconet serves as an effective and efficient batch effect removal tool that can facilitate the integration of single-cell datasets in a reference-free and molecular feature-preserving mode.
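To make the two properties named in the abstract concrete (reference-free, and output kept in the original gene space), here is a deliberately simplified toy sketch. It is not Beaconet's adversarial correction network: it only shifts each batch so that its per-gene mean matches the pooled mean of all cells, which is a much cruder form of alignment. The function name `center_batches_to_pooled_mean` and the toy data are assumptions made for illustration.

```python
from statistics import fmean


def center_batches_to_pooled_mean(batches):
    """Toy reference-free correction (NOT the Beaconet algorithm).

    `batches` maps a batch name to a list of cells; each cell is a list of
    expression values, one per gene, in the same gene order for every batch.
    """
    all_cells = [cell for cells in batches.values() for cell in cells]
    n_genes = len(all_cells[0])
    # Pooled target: no single batch is singled out as the reference.
    pooled = [fmean(cell[g] for cell in all_cells) for g in range(n_genes)]

    corrected = {}
    for name, cells in batches.items():
        batch_mean = [fmean(cell[g] for cell in cells) for g in range(n_genes)]
        shift = [pooled[g] - batch_mean[g] for g in range(n_genes)]
        # Each corrected cell still has one value per gene, so the result
        # stays in gene-expression space rather than a latent embedding.
        corrected[name] = [
            [cell[g] + shift[g] for g in range(n_genes)] for cell in cells
        ]
    return corrected


# Hypothetical toy data: two batches, two genes, batch "B" offset by +2.
batches = {
    "A": [[1.0, 5.0], [3.0, 7.0]],
    "B": [[3.0, 7.0], [5.0, 9.0]],
}
corrected = center_batches_to_pooled_mean(batches)
```

Because the target is computed from all cells together, the result does not depend on which batch is listed first, which is the sense of "reference-free" illustrated here. A plain mean shift cannot capture cell-type-specific batch effects and can produce negative values; it is shown only as a minimal stand-in for distribution alignment.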
Keyphrases
- single cell
- rna seq
- high throughput
- machine learning
- gene expression
- electronic health record
- big data
- deep learning
- single molecule
- dna methylation
- stem cells
- induced apoptosis
- anaerobic digestion
- cell proliferation
- bone marrow
- mesenchymal stem cells
- cell therapy
- oxidative stress
- endoplasmic reticulum stress
- neural network