Sequencing and characterizing short tandem repeats in the human genome.
Hope A Tanudisastro, Ira W Deveson, Harriet Dashnow, Daniel G MacArthur
Published in: Nature Reviews Genetics (2024)
Short tandem repeats (STRs) are highly polymorphic sequences throughout the human genome that are composed of repeated copies of a 1-6 bp motif. Over 1 million variable STR loci are known, some of which regulate gene expression and influence complex traits, such as height. Moreover, variants in at least 60 STR loci cause genetic disorders, including Huntington disease and fragile X syndrome. Accurately identifying and genotyping STR variants is challenging, particularly when mapping short reads to repetitive regions and inferring the lengths of expanded repeats. Recent advances in sequencing technology and computational tools for STR genotyping from sequencing data promise to help overcome this challenge and solve genetically unresolved cases and the 'missing heritability' of polygenic traits. Here, we compare STR genotyping methods, analytical tools and their applications to understand the effect of STR variation on health and disease. We identify emergent opportunities to refine genotyping and quality-control approaches as well as to integrate STRs into variant-calling workflows and large cohort analyses.
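To make the definition of an STR concrete, the following is a minimal illustrative sketch, not one of the genotyping tools compared in the review. It scans a DNA string for perfect tandem repeats of a 1-6 bp motif. The function name `find_strs`, the `min_copies` threshold of 3 and the restriction to uppercase A/C/G/T are assumptions made for this example; real STR callers also handle imperfect repeats, sequencing errors and read alignment.

```python
import re


def find_strs(seq, min_motif=1, max_motif=6, min_copies=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    Illustrative only: assumes an uppercase A/C/G/T string and exact repeats.
    """
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-bp motif followed by at least (min_copies - 1) further exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "CAGCAG"); the shorter motif already reports them.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Five copies of CAG (the motif expanded in Huntington disease) with flanking bases.
print(find_strs("TTCAGCAGCAGCAGCAGAC"))  # [(2, 'CAG', 5)]
```

The reported start position is 0-based, and the motif is whichever rotation the leftmost match happens to begin on, so the same repeat could be reported as CAG, AGC or GCA depending on the flanking sequence.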
Keyphrases
- genome wide
- DNA methylation
- copy number
- gene expression
- endothelial cells
- quality control
- healthcare
- public health
- high throughput
- big data
- induced pluripotent stem cells
- high resolution
- body mass index
- mental health
- pluripotent stem cells
- high frequency
- machine learning
- climate change
- electronic health record
- deep learning
- health promotion
- mass spectrometry