Impact of sequencing depth and read length on single cell RNA sequencing data of T cells.
Simone Rizzetto, Auda A Eltahla, Peijie Lin, Rowena Bull, Andrew R Lloyd, Joshua Wing-Kei Ho, Vanessa Venturi, Fabio Luciani
Published in: Scientific Reports (2017)
Single cell RNA sequencing (scRNA-seq) offers great potential for measuring the gene expression profiles of heterogeneous cell populations. In immunology, scRNA-seq has enabled the characterisation of transcript sequence diversity in functionally relevant T cell subsets, and the identification of the full-length T cell receptor (TCRαβ), which defines the specificity against cognate antigens. Several factors, e.g. RNA library capture, cell quality, and sequencing output, affect the quality of scRNA-seq data. We studied the effects of read length and sequencing depth on the quality of gene expression profiles, cell type identification, and TCRαβ reconstruction, utilising 1,305 single cells from 8 publicly available scRNA-seq datasets, and simulation-based analyses. With short read lengths (<50 bp), a larger number of unique genes was identified, but the resulting gene expression profiles showed higher technical variability than profiles from longer reads. Successful TCRαβ reconstruction was achieved for 6 datasets (81% to 100%) with at least 0.25 million paired-end (PE) reads of length >50 bp, while it failed for datasets with <30 bp reads. Sufficient read length and sequencing depth can control technical noise to enable accurate identification of TCRαβ and gene expression profiles from scRNA-seq data of T cells.
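The simulation-based analyses mentioned in the abstract vary read length and sequencing depth. A minimal sketch of that idea follows, assuming reads are held in memory as plain strings; it is not the authors' pipeline. The function names `simulate_reads` and `meets_reported_thresholds` are hypothetical, and the only figures taken from the abstract are the 0.25 million PE reads and >50 bp thresholds.

```python
import random


def simulate_reads(reads, read_length, depth, seed=0):
    """Mimic a shallower, shorter-read run of one cell.

    Hypothetical helper: `reads` is a list of sequence strings.
    """
    rng = random.Random(seed)
    # Subsample without replacement; keep all reads if fewer than `depth` exist.
    kept = rng.sample(reads, min(depth, len(reads)))
    # Trim each read to its first `read_length` bases.
    return [read[:read_length] for read in kept]


def meets_reported_thresholds(n_pe_reads, read_length):
    """Check a cell against the figures quoted in the abstract:
    at least 0.25 million paired-end reads of length >50 bp."""
    return n_pe_reads >= 250_000 and read_length > 50


if __name__ == "__main__":
    # Toy input: 1,000 identical 120 bp reads (illustrative data only).
    toy_reads = ["ACGT" * 30] * 1000
    short_shallow = simulate_reads(toy_reads, read_length=25, depth=100)
    print(len(short_shallow), len(short_shallow[0]))
    print(meets_reported_thresholds(250_000, 100))
```

Real data would normally be subsampled at the FASTQ or BAM level, keeping read pairs together; the in-memory list here only keeps the sketch self-contained.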
Keyphrases
- single cell
- RNA-seq
- genome wide
- regulatory T cells
- high throughput
- gene expression
- single molecule
- genome wide identification
- bioinformatics analysis
- electronic health record
- copy number
- optical coherence tomography
- big data
- DNA methylation
- dendritic cells
- quality improvement
- air pollution
- high resolution
- immune response
- peripheral blood
- bone marrow
- heat stress