A method for partitioning the information contained in a protein sequence between its structure and function.

Andrea Possenti, Michele Vendruscolo, Carlo Camilloni, Guido Tiana
Published in: Proteins (2019)
Proteins use the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The ability to encode such functions is controlled by the balance between the amount of information supplied by the sequence and the amount left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding its structure (the 'information gap') is very close to what is needed to encode its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially designed protein sequences.
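
The bookkeeping behind the 'information gap' can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give their estimator, and their structural estimate accounts for folding thermodynamics, which is not reproduced here. The sketch assumes a simple composition-based Shannon estimate of the information in a sequence, and the function names and the `structure_bits` input are hypothetical.

```python
import math
from collections import Counter


def sequence_information_bits(sequence: str) -> float:
    """Composition-based Shannon estimate of the information in a sequence.

    Assumption: residues are treated as independent draws from the
    sequence's own amino-acid composition. For the 20 standard amino
    acids this is at most len(sequence) * log2(20) bits.
    """
    n = len(sequence)
    counts = Counter(sequence)
    # Shannon entropy per residue, in bits.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return n * entropy


def information_gap(sequence_bits: float, structure_bits: float) -> float:
    """Information left in the sequence after specifying the structure.

    `structure_bits` is a hypothetical input here: the paper estimates it
    from the thermodynamics of folding, which this sketch does not model.
    """
    return sequence_bits - structure_bits
```

Under these assumptions, a sequence made of a single repeated residue carries zero bits, and any gap computed this way is only as meaningful as the structural estimate supplied to it.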
Keyphrases
  • amino acid
  • protein protein
  • health information
  • binding protein
  • healthcare
  • small molecule
  • dna methylation
  • genome wide
  • molecular dynamics simulations
  • social media
  • ionic liquid