Highly complete long-read genomes reveal pangenomic variation underlying yeast phenotypic diversity.

Cory A Weller, Ilya Andreev, Michael J Chambers, Morgan Park, Joshua S Bloom, Meru J Sadhu
Published in: Genome Research (2023)
Understanding the genetic causes of trait variation is a primary goal of genetic research. One way that individuals can vary genetically is through variable pangenomic genes: genes that are present in only some individuals in a population. The presence or absence of entire genes could have large effects on trait variation. However, variable pangenomic genes can be missed in standard genotyping workflows because those workflows rely on aligning short sequencing reads to a reference genome. A popular method for studying the genetic basis of trait variation is linkage mapping, which identifies quantitative trait loci (QTLs), regions of the genome that harbor causative genetic variants. Large-scale linkage mapping in the budding yeast Saccharomyces cerevisiae has found thousands of QTLs affecting myriad yeast phenotypes. To enable the resolution of QTLs caused by variable pangenomic genes, we used long-read sequencing to generate highly complete de novo assemblies of 16 diverse yeast isolates. With these assemblies, we resolved QTLs for growth on maltose, sucrose, and raffinose, and under oxidative stress, to specific genes that are absent from the reference genome but present in the broader yeast population at appreciable frequency. Copies of genes also duplicate onto chromosomes where they are absent in the reference genome, and we found that these copies generate additional QTLs whose resolution requires pangenome characterization. Our findings demonstrate the need for highly complete genome assemblies to identify the genetic basis of trait variation.
Keyphrases
  • genome wide
  • dna methylation
  • saccharomyces cerevisiae
  • copy number
  • oxidative stress
  • gene expression
  • high resolution
  • dna damage
  • human immunodeficiency virus
  • mass spectrometry