Differential retention of Pfam domains contributes to long-term evolutionary trends.

Jennifer E James, Paul Nelson, Joanna Masel
Published in: Molecular biology and evolution (2023)
Protein domains that emerged more recently in evolution have higher structural disorder and greater clustering of hydrophobic residues along the primary sequence. It is hard to explain how selection acting via descent with modification could act so slowly as not to saturate over the extraordinarily long timescales over which these trends persist. Here we hypothesize that the trends were created by a higher level of selection that differentially affects the retention probabilities of protein domains with different properties. This hypothesis predicts that loss rates should depend on disorder and clustering trait values. To test this, we inferred loss rates via maximum likelihood for animal Pfam domains, after first performing a set of stringent quality control methods to reduce annotation errors. Intermediate trait values, matching those of ancient domains, are associated with the lowest loss rates, making our results difficult to explain with reference to previously described homology detection biases. Simulations confirm that effect sizes are of the right magnitude to produce the observed long-term trends. Our results support the hypothesis that differential domain loss slowly weeds out those protein domains that have non-optimal levels of disorder and clustering. The same preferences also shape differential diversification of Pfam domains, further impacting proteome composition.
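The core idea, that trait-dependent loss alone can shift the trait distribution of surviving domains as they age, can be illustrated with a toy calculation. The sketch below is not the authors' maximum likelihood inference or their simulation. The quadratic loss-rate function, the optimum of 0.3, the rate constants, and the uniform distribution of traits at birth are all hypothetical choices made only for illustration.

```python
import math

def loss_rate(trait, optimum=0.3, base=0.01, penalty=0.5):
    """Hypothetical loss rate per unit time: lowest at `optimum`,
    rising quadratically as the trait moves away from it."""
    return base + penalty * (trait - optimum) ** 2

def mean_trait_of_survivors(age, n_grid=1001):
    """Expected trait value among domains that survive to `age`,
    assuming (for illustration) traits at birth are uniform on [0, 1]
    and each domain is lost at a constant, trait-dependent rate."""
    traits = [i / (n_grid - 1) for i in range(n_grid)]
    # Probability of surviving to `age` under a constant loss rate.
    weights = [math.exp(-loss_rate(x) * age) for x in traits]
    return sum(x * w for x, w in zip(traits, weights)) / sum(weights)

# Older cohorts are increasingly enriched for near-optimal trait values.
for age in (0, 10, 100, 1000):
    print(age, round(mean_trait_of_survivors(age), 3))
```

In this toy model no lineage changes its trait value. The mean among survivors drifts toward the assumed optimum purely because domains far from it are lost faster, which is the qualitative pattern the hypothesis predicts.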
Keyphrases
  • quality control
  • rna seq
  • single cell
  • genome wide
  • amino acid
  • protein protein
  • gene expression
  • emergency department
  • molecular dynamics
  • ionic liquid
  • electronic health record
  • sensitive detection
  • label free