Improved prediction of site-rates from structure with averaging across homologs.

Christoffer Norn, Fábio Oliveira, Ingemar Andre
Published in: Protein science : a publication of the Protein Society (2024)
Variation in mutation rates at sites in proteins can largely be understood by the constraint that proteins must fold into stable structures. Models that calculate site-specific rates based on protein structure and a thermodynamic stability model have shown a significant but modest ability to predict empirical site-specific rates calculated from sequence. Models that use detailed atomistic models of protein energetics do not outperform simpler approaches using packing density. We demonstrate that a fundamental reason for this is that empirical site-specific rates are the result of the average effect of many different microenvironments in a phylogeny. By analyzing the results of evolutionary dynamics simulations, we show how averaging site-specific rates across many extant protein structures can lead to correct recovery of site-rate prediction. This result is also demonstrated in natural protein sequences and experimental structures. Using predicted structures, we demonstrate that atomistic models can improve upon contact density metrics in predicting site-specific rates from a structure. The results give fundamental insights into the factors governing the distribution of site-specific rates in protein families.
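The averaging idea in the abstract can be illustrated with a minimal sketch. It assumes per-site rate predictions for each homolog have already been mapped onto shared alignment columns, with gaps stored as NaN. The function name `average_site_rates` and the additive-noise toy model are hypothetical choices made for illustration. They are not the authors' stability model or their evolutionary dynamics simulations.

```python
import numpy as np


def average_site_rates(rates):
    """Average predicted site rates over homologs, column by column.

    rates: array-like of shape (n_homologs, n_alignment_columns);
    NaN marks a gap (no residue from that homolog in that column).
    """
    rates = np.asarray(rates, dtype=float)
    # nanmean skips gaps, so each column averages only the homologs present.
    return np.nanmean(rates, axis=0)


# Toy illustration (assumed noise model, not data from the paper):
# each homolog's structure yields the family-level rate plus independent
# noise standing in for its particular microenvironment.
rng = np.random.default_rng(0)
n_homologs, n_sites = 20, 200
true_rates = rng.gamma(shape=2.0, scale=0.5, size=n_sites)
per_homolog = true_rates + rng.normal(0.0, 1.0, size=(n_homologs, n_sites))

averaged = average_site_rates(per_homolog)

# Pearson correlation with the family-level rates, before and after averaging.
single_corrs = [np.corrcoef(row, true_rates)[0, 1] for row in per_homolog]
mean_single_corr = float(np.mean(single_corrs))
averaged_corr = float(np.corrcoef(averaged, true_rates)[0, 1])

print(f"mean single-structure correlation: {mean_single_corr:.2f}")
print(f"averaged-over-homologs correlation: {averaged_corr:.2f}")
```

In this toy setting the averaged prediction tracks the family-level rates more closely than a typical single-structure prediction, because independent per-structure deviations partly cancel. Real homologs are phylogenetically related, so their deviations are not independent and the gain would likely be smaller.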
Keyphrases
  • high resolution
  • protein protein
  • amino acid
  • binding protein
  • gene expression
  • mass spectrometry
  • small molecule
  • dna methylation
  • genome wide
  • aqueous solution
  • monte carlo