GeneSPIDER2: large scale GRN simulation and benchmarking with perturbed single-cell data.

Mateusz Garbulowski, Thomas Hillerton, Daniel C Morgan, Deniz Seçilmiş, Lisbet Sonnhammer, Andreas Tjärnberg, Torbjörn E M Nordling, Erik L L Sonnhammer
Published in: NAR genomics and bioinformatics (2024)
Single-cell data is increasingly used for gene regulatory network (GRN) inference, and benchmarks for this task have been developed based on simulated data. However, existing single-cell simulators cannot model the effects of gene perturbations. A further challenge is generating large-scale GRNs, a process that often runs into computational and stability issues. We present GeneSPIDER2, an update of the GeneSPIDER MATLAB toolbox for GRN benchmarking, inference, and analysis. Several software modules have improved capabilities and performance, and new functionality has been added. A major improvement is the ability to generate large GRNs with biologically realistic topological properties, namely a scale-free degree distribution and modularity. Another major addition is the simulation of single-cell data, which is becoming increasingly popular as input for GRN inference. Specifically, we introduce the unique ability to generate single-cell data based on genetic perturbations. Finally, the simulated single-cell data was compared to real single-cell Perturb-seq data from two cell lines, showing that the synthetic and real data exhibit similar properties.
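To make the two ideas in the abstract concrete (a large network with a heavy-tailed degree distribution that stays numerically stable, and expression responses to gene perturbations), here is a minimal Python sketch. It is not GeneSPIDER2 code and does not use the toolbox's API. The function names, the preferential-attachment rule, the diagonal-dominance stabilisation, and the linear steady-state model are all assumptions chosen for illustration. Modularity and a single-cell noise model (counts, dropout) are deliberately left out.

```python
import numpy as np


def scale_free_grn(n_genes, m_edges=2, seed=0):
    """Hypothetical helper: signed, weighted GRN grown by preferential attachment.

    A[i, j] is the influence of gene j on gene i. Each new gene picks up to
    `m_edges` regulators among the earlier genes, favouring genes that already
    regulate many targets, which yields a heavy-tailed out-degree distribution.
    """
    rng = np.random.default_rng(seed)
    A = np.zeros((n_genes, n_genes))
    out_degree = np.ones(n_genes)  # start at 1 so every gene can be chosen
    for target in range(1, n_genes):
        k = min(m_edges, target)
        probs = out_degree[:target] / out_degree[:target].sum()
        regulators = rng.choice(target, size=k, replace=False, p=probs)
        signs = rng.choice([-1.0, 1.0], size=k)  # activation or repression
        A[target, regulators] = signs * rng.uniform(0.5, 1.0, size=k)
        out_degree[regulators] += 1
    # Assumed stabilisation: strong negative self-regulation (strict diagonal
    # dominance), so every eigenvalue of A has a negative real part.
    off_diag = np.abs(A).sum(axis=1)
    A[np.diag_indices(n_genes)] = -(off_diag + 1.0)
    return A


def steady_state_response(A, P):
    """Assumed linear model: 0 = A @ Y + P, hence Y = -inv(A) @ P."""
    return np.linalg.solve(A, -P)


A = scale_free_grn(200)
P = -np.eye(200)                  # one knockdown experiment per gene
Y = steady_state_response(A, P)   # column j: response to perturbing gene j
n_targets = (A != 0).sum(axis=0) - 1  # out-degree per gene, excluding self-loop
print("largest hub regulates", n_targets.max(), "genes")
```

Because each gene draws its regulators only from earlier genes, this toy network is acyclic; a more realistic generator would also add feedback loops and then re-check stability.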
Keyphrases
  • single cell
  • RNA-seq
  • electronic health record
  • high throughput
  • big data
  • data analysis
  • genome wide
  • DNA methylation
  • copy number
  • neural network
  • genome wide analysis