Inferring repeat-protein energetics from evolutionary information.

Rocío Espada, R Gonzalo Parra, Thierry Mora, Aleksandra M Walczak, Diego U Ferreiro
Published in: PLoS computational biology (2017)
Natural protein sequences contain a record of their history. A common constraint in a given protein family is the ability to fold to specific structures, and it has been shown that the main native ensemble can be inferred by analyzing covariations in extant sequences. Still, many natural proteins that fold into the same structural topology show different stabilization energies, and these are often related to their physiological behavior. We propose a description of the energetic variation caused by sequence modifications in repeat proteins, systems for which the overall problem is simplified by their inherent symmetry. We explicitly account for single amino acid and pairwise interactions and treat higher-order correlations with a single term. We show that the resulting evolutionary field can be interpreted with structural detail. We trace the variations in the energetic scores of natural proteins and relate them to their experimental characterization. The resulting energetic evolutionary field allows the prediction of the folding free energy change for several mutants, and can be used to generate synthetic sequences that are statistically indistinguishable from their natural counterparts.
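To make the structure of such a score concrete, here is a minimal Python sketch of a Potts-like energy with single-site terms, pairwise terms, and one extra term standing in for higher-order correlations. The abstract does not give the functional form, so everything here is an assumption for illustration: the names `energy`, `delta_energy`, `random_params`, `h`, `J`, `lam` and `repeat_len` are hypothetical, the sign convention is arbitrary, and the higher-order term is assumed to count identical residues at equivalent positions of consecutive repeats. The toy parameters are random numbers, not values fitted to any alignment.

```python
import random

Q = 21  # assumed alphabet size: 20 amino acids plus a gap symbol


def energy(seq, h, J, lam=0.0, repeat_len=None):
    """Potts-like score for an integer-encoded sequence.

    seq        -- list of residues, each an int in range(Q)
    h          -- h[i][a]: single-site term for residue a at position i
    J          -- J[(i, j)][a][b]: pairwise term for positions i < j
    lam        -- weight of the single higher-order term (hypothetical form)
    repeat_len -- length of one repeat; only used when lam is non-zero
    """
    e = -sum(h[i][a] for i, a in enumerate(seq))
    e -= sum(Jij[seq[i]][seq[j]] for (i, j), Jij in J.items())
    if lam and repeat_len:
        # Assumed higher-order term: number of identical residues at
        # equivalent positions of consecutive repeats.
        same = sum(
            seq[i] == seq[i + repeat_len]
            for i in range(len(seq) - repeat_len)
        )
        e += lam * same
    return e


def delta_energy(seq, pos, new_res, h, J, lam=0.0, repeat_len=None):
    """Score change caused by a point mutation at position pos."""
    mutant = list(seq)
    mutant[pos] = new_res
    return (energy(mutant, h, J, lam, repeat_len)
            - energy(seq, h, J, lam, repeat_len))


def random_params(length, seed=0):
    """Toy random parameters; real ones would be fitted to an alignment."""
    rng = random.Random(seed)
    h = [[rng.gauss(0, 1) for _ in range(Q)] for _ in range(length)]
    J = {
        (i, j): [[rng.gauss(0, 0.1) for _ in range(Q)] for _ in range(Q)]
        for i in range(length)
        for j in range(i + 1, length)
    }
    return h, J
```

Under these assumptions, `delta_energy(wild_type, pos, new_res, h, J)` gives the score difference between a point mutant and its parent sequence, which is the kind of quantity one would compare against a measured change in folding free energy.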