DNA binding analysis of rare variants in homeodomains reveals homeodomain specificity-determining residues.
Kian Hong Kock, Patrick K Kimes, Stephen S Gisselbrecht, Sachi Inukai, Sabrina K Phanor, James T Anderson, Gayatri Ramakrishnan, Colin H Lipper, Dongyuan Song, Jesse V Kurland, Julia M Rogers, Raehoon Jeong, Stephen C Blacklow, Rafael A Irizarry, Martha L Bulyk
Published in: Nature Communications (2024)
Homeodomains (HDs) are the second largest class of DNA binding domains (DBDs) among eukaryotic sequence-specific transcription factors (TFs) and are the TF structural class with the largest number of disease-associated mutations in the Human Gene Mutation Database (HGMD). Despite numerous structural studies and large-scale analyses of HD DNA binding specificity, HD-DNA recognition is still not fully understood. Here, we analyze 92 human HD mutants, including disease-associated variants and variants of uncertain significance (VUS), for their effects on DNA binding activity. Many of the variants alter DNA binding affinity and/or specificity. Detailed biochemical analysis and structural modeling identify 14 previously unknown specificity-determining positions, 5 of which do not contact DNA. The same missense substitution at analogous positions within different HDs often exhibits different effects on DNA binding activity. Variant effect prediction tools perform moderately well in distinguishing variants with altered DNA binding affinity, but poorly in identifying those with altered binding specificity. Our results highlight the need for biochemical assays of TF coding variants and prioritize dozens of variants for further investigations into their pathogenicity and the development of clinical diagnostics and precision therapies.
Keyphrases
- dna binding
- transcription factor
- copy number
- endothelial cells
- circulating tumor
- structural basis
- gene expression
- dna methylation
- emergency department
- high throughput
- escherichia coli
- cell free
- pseudomonas aeruginosa
- autism spectrum disorder
- staphylococcus aureus
- amino acid
- circulating tumor cells
- nucleic acid
- single cell