In vivo functional phenotypes from a computational epistatic model of evolution.
Sophia Alvarez, Charisse Michelle Nartey, Nicholas Mercado, Jose Alberto de la Paz, Tea Huseinbegovic, Faruck Morcos
Published in: Proceedings of the National Academy of Sciences of the United States of America (2024)
Computational models of evolution are valuable for understanding the dynamics of sequence variation, for inferring phylogenetic relationships or potential evolutionary pathways, and for biomedical and industrial applications. Despite these benefits, few studies have validated the propensity of such models to generate outputs with in vivo functionality, which would enhance their value as accurate and interpretable evolutionary algorithms. We demonstrate the power of epistasis inferred from natural protein families to evolve sequence variants in an algorithm we developed called sequence evolution with epistatic contributions (SEEC). Using the Hamiltonian of the joint probability of sequences in the family as a fitness metric, we sampled TEM-1 variants and experimentally tested them for in vivo β-lactamase activity in Escherichia coli. These evolved proteins can have dozens of mutations dispersed across the structure while preserving sites essential for both catalysis and interactions. Remarkably, these variants retain family-like functionality while being more active than their wild-type predecessor. We found that, depending on the inference method used to generate the epistatic constraints, different parameters simulate diverse selection strengths. Under weaker selection, local Hamiltonian fluctuations reliably predict relative changes to variant fitness, recapitulating neutral evolution. SEEC has the potential to explore the dynamics of neofunctionalization, characterize viral fitness landscapes, and facilitate vaccine development.
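To make the idea of a Hamiltonian used as a fitness metric concrete, here is a minimal sketch, assuming a Potts-style model with per-site fields `h` and pairwise couplings `J` (the usual form of an epistatic model inferred from a protein family). The abstract does not spell out the SEEC update rule, so the single-site conditional resampling below is an assumption for illustration, not the authors' implementation. All function names and the toy random parameters are hypothetical; a real model would be inferred from a multiple sequence alignment.

```python
import math
import random


def potts_hamiltonian(seq, h, J):
    """H(s) = -sum_i h[i][s_i] - sum_{i<j} J[i][j][s_i][s_j].

    Lower values mean the sequence is more probable under the model.
    """
    n = len(seq)
    fields = sum(h[i][seq[i]] for i in range(n))
    couplings = sum(
        J[i][j][seq[i]][seq[j]] for i in range(n) for j in range(i + 1, n)
    )
    return -(fields + couplings)


def site_conditional(seq, i, h, J):
    """P(state a at site i | all other sites), from the same h and J.

    Assumes J is symmetric: J[i][j][a][b] == J[j][i][b][a].
    """
    q = len(h[i])
    logits = [
        h[i][a] + sum(J[i][j][a][seq[j]] for j in range(len(seq)) if j != i)
        for a in range(q)
    ]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - top) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def evolve(seq, h, J, steps, rng):
    """Assumed update rule: pick a random site, resample it conditionally."""
    seq = list(seq)
    for _ in range(steps):
        i = rng.randrange(len(seq))
        probs = site_conditional(seq, i, h, J)
        seq[i] = rng.choices(range(len(probs)), weights=probs)[0]
    return seq


def random_toy_model(n, q, rng):
    """Hypothetical random parameters standing in for an inferred model."""
    h = [[rng.gauss(0, 1) for _ in range(q)] for _ in range(n)]
    J = [[[[0.0] * q for _ in range(q)] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            for a in range(q):
                for b in range(q):
                    J[i][j][a][b] = J[j][i][b][a] = rng.gauss(0, 0.5)
    return h, J


if __name__ == "__main__":
    rng = random.Random(0)
    h, J = random_toy_model(n=6, q=4, rng=rng)
    start = [rng.randrange(4) for _ in range(6)]
    evolved = evolve(start, h, J, steps=50, rng=rng)
    print(potts_hamiltonian(start, h, J), potts_hamiltonian(evolved, h, J))
```

In this sketch the conditional distribution at a site is exactly the Boltzmann distribution over the Hamiltonians of the sequences that differ only at that site, so each step is a Gibbs-sampling move on the model's landscape. Scaling `h` and `J` by a common factor would sharpen or flatten that distribution, which is one simple way to picture stronger or weaker selection.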
Keyphrases
- escherichia coli
- copy number
- physical activity
- body composition
- machine learning
- wild type
- amino acid
- genome wide
- klebsiella pneumoniae
- sars cov
- single cell
- gene expression
- binding protein
- multidrug resistant
- staphylococcus aureus
- dna methylation
- human milk
- cystic fibrosis
- preterm infants
- protein protein
- biofilm formation
- neural network