Blended Genome Exome (BGE) as a Cost Efficient Alternative to Deep Whole Genomes or Arrays.
Matthew DeFelice, Jonna L Grimsby, Daniel P Howrigan, Kai Yuan, Sinéad B Chapman, Christine R Stevens, Samuel DeLuca, Megan Townsend, Joseph D Buxbaum, Margaret Pericak-Vance, Shengying Qin, Dan J Stein, Solomon Teferra, Ramnik J Xavier, Hailiang Huang, Alicia R Martin, Benjamin M Neale. Published in: bioRxiv : the preprint server for biology (2024)
Genomic scientists have long been promised cheaper DNA sequencing, but deep whole genomes are still costly, especially when considered for large cohorts in population-level studies. More affordable options include microarrays + imputation, whole exome sequencing (WES), or low-pass whole genome sequencing (WGS) + imputation. WES + array + imputation has recently been shown to yield 99% of association signals detected by WGS. However, no existing method avoids both the ascertainment biases of arrays and the need to merge different data types while still benefiting from deeper exome coverage to enhance novel coding variant detection. We developed a new, combined "Blended Genome Exome" (BGE) in which a whole genome library is generated, an aliquot of that genome is amplified by PCR, the exome regions are selected and enriched, and the genome and exome libraries are combined back into a single tube for sequencing (33% exome, 67% genome). This creates a single CRAM with a low-coverage whole genome (2-3x) combined with a higher-coverage exome (30-40x). This BGE can be used for imputing common variants throughout the genome as well as for calling rare coding variants. We tested this new method and observed >99% r² concordance between imputed BGE data and existing 30x WGS data for exome and genome variants. BGE can serve as a useful and cost-efficient alternative sequencing product for genomic researchers, requiring ten-fold less sequencing compared to 30x WGS without the need for complicated harmonization of array and sequencing data.
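The abstract does not say how the r² concordance was computed. As a minimal sketch, assuming it is the squared Pearson correlation between imputed allele dosages and the genotypes called from 30x WGS at a single variant (a common convention for imputation accuracy), the calculation could look like the following. The function name `dosage_r2` and the example values are hypothetical and are not data from the study.

```python
def dosage_r2(imputed, truth):
    """Squared Pearson correlation between imputed dosages and truth genotypes.

    imputed: per-sample imputed alternate-allele dosages (floats in [0, 2]).
    truth:   per-sample genotypes from the reference call set (0, 1, or 2).
    Returns NaN when either vector has no variance (e.g. a monomorphic site).
    """
    if len(imputed) != len(truth) or len(imputed) < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")

    n = len(imputed)
    mean_x = sum(imputed) / n
    mean_y = sum(truth) / n

    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(imputed, truth))
    sxx = sum((x - mean_x) ** 2 for x in imputed)
    syy = sum((y - mean_y) ** 2 for y in truth)

    if sxx == 0 or syy == 0:
        return float("nan")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical example: four samples at one variant.
example_imputed = [0.1, 0.9, 1.8, 0.2]   # illustrative imputed dosages
example_truth = [0, 1, 2, 0]             # illustrative high-coverage genotypes
example_r2 = dosage_r2(example_imputed, example_truth)
```

In practice such a per-variant value would typically be aggregated across variants, often within allele-frequency bins. The aggregation used in the study is not described in the abstract.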