Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations: application to HIV-1 quasispecies.
Dylan H Westfall, Wenjie Deng, Alec Pankow, Hugh Murrell, Lennie Chen, Hong Zhao, Carolyn Williamson, Morgane Rolland, Ben Murrell, James I Mullins
Published in: Virus Evolution (2024)
Pathogen diversity resulting in quasispecies can enable persistence and adaptation to host defenses and therapies. However, accurate quasispecies characterization can be impeded by errors introduced during sample handling and sequencing, which can require extensive optimizations to overcome. We present complete laboratory and bioinformatics workflows to overcome many of these hurdles. The Pacific Biosciences single molecule real-time platform was used to sequence polymerase chain reaction (PCR) amplicons derived from cDNA templates tagged with unique molecular identifiers (SMRT-UMI). Optimized laboratory protocols were developed through extensive testing of different sample preparation conditions to minimize between-template recombination during PCR. The use of UMIs allowed accurate template quantitation as well as removal of point mutations introduced during PCR and sequencing, producing a highly accurate consensus sequence from each template. Production of highly accurate sequences from the large datasets generated by SMRT-UMI sequencing is facilitated by a novel bioinformatic pipeline, Probabilistic Offspring Resolver for Primer IDs (PORPIDpipeline). PORPIDpipeline automatically filters and parses circular consensus reads by sample, identifies and discards reads with UMIs likely created from PCR and sequencing errors, generates consensus sequences, checks for contamination within the dataset, and removes any sequence with evidence of PCR recombination, heteroduplex formation, or early-cycle PCR errors. The optimized SMRT-UMI sequencing and PORPIDpipeline methods presented here represent a highly adaptable and established starting point for accurate sequencing of diverse pathogens. These methods are illustrated through characterization of human immunodeficiency virus quasispecies in a virus transmitter-recipient pair of individuals.
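The general idea of collapsing reads that share a UMI into one consensus sequence per template can be sketched in a few lines of Python. This is an illustrative simplification and not the PORPIDpipeline implementation: the function name `umi_consensus`, the `min_family_size` threshold, and the assumption that reads within a UMI family are already aligned to equal length are all hypothetical choices made for the example. The probabilistic UMI error filtering, contamination checks, and recombination and heteroduplex screening described in the abstract are not modeled here.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=3):
    """Collapse (umi, sequence) pairs into one consensus sequence per UMI.

    Assumptions (hypothetical, for illustration only):
    - sequences within a UMI family are pre-aligned and of equal length;
    - families smaller than min_family_size are discarded as unreliable.
    """
    # Group reads into families keyed by their UMI.
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to outvote PCR/sequencing errors
        if len({len(s) for s in seqs}) != 1:
            continue  # unequal lengths would need a real alignment step
        # Majority vote at each position; ties are not specially handled.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy input: three reads share one UMI (one carries a point error),
# and a singleton UMI family is dropped.
example_reads = [
    ("AAAT", "ACGT"),
    ("AAAT", "ACGT"),
    ("AAAT", "ACCT"),
    ("GGGC", "TTTT"),
]
example_consensus = umi_consensus(example_reads)
```

In this toy input the single discordant base in the `AAAT` family is outvoted by the other two reads, and the number of keys in the result gives a template count, which mirrors the quantitation and error-removal roles of UMIs described above.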
Keyphrases
- human immunodeficiency virus
- single cell
- single molecule
- high resolution
- antiretroviral therapy
- hepatitis c virus
- rna seq
- hiv infected
- patient safety
- hiv positive
- molecularly imprinted
- clinical practice
- mass spectrometry
- dna repair
- high throughput
- gene expression
- hiv aids
- hiv testing
- oxidative stress
- liquid chromatography
- living cells
- high performance liquid chromatography
- gram negative
- men who have sex with men
- fluorescent probe