Genetic Algorithm for Automated Parameterization of Network Hamiltonian Models of Amyloid Fibril Formation.
Gianmarc Grazioli, Andy Tao, Inika Bhatia, Patrick Regan
Published in: The Journal of Physical Chemistry B (2024)
The time scales of long-time atomistic molecular dynamics simulations are typically reported in microseconds, while the time scales of experiments studying the kinetics of amyloid fibril formation are typically reported in minutes or hours. This time scale deficit of roughly 9 orders of magnitude presents a major challenge in the design of computer simulation methods for studying protein aggregation events. Coarse-grained molecular simulations offer a computationally tractable path forward for exploring the molecular mechanisms driving the formation of these structures, which are implicated in diseases such as Alzheimer's, Parkinson's, and type-II diabetes. Network Hamiltonian models of aggregation are centered around a Hamiltonian function that returns the total energy of a system of aggregating proteins, given the graph structure of the system as input. In the graph, or network, representation of the system, each protein molecule is represented as a node, and noncovalent bonds between proteins are represented as edges. The parameters, i.e., a set of coefficients that determine the degree to which each topological degree of freedom is favored or disfavored, must be determined for each network Hamiltonian model, and finding them is a well-known technical challenge. The genetic algorithm methodology is first demonstrated by beginning with an initial set of randomly parametrized models of low fibril fraction (<5% fibrillar) and evolving them over subsequent generations, ultimately leading to high fibril fraction models (>70% fibrillar). The methodology is also demonstrated by applying it to the optimization of previously published network Hamiltonian models for the 5 key amyloid fibril topologies that have been reported in the Protein Data Bank (PDB). The models generated by the genetic algorithm produced fibril fractions that surpass previously published fibril fractions in 3 of 5 cases, including the most naturally abundant amyloid fibril topology, the 1,2 2-ribbon, which features a steric zipper.
The authors also aim to encourage more widespread use of the network Hamiltonian methodology for fitting a wide variety of self-assembling systems by releasing a free open-source implementation of the genetic algorithm introduced here.
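The two ideas in the abstract, an energy function over graph structure and a genetic algorithm that tunes its coefficients, can be illustrated with a minimal Python sketch. This is not the authors' released implementation. The three graph statistics (edges, 2-stars, triangles), the function names, the parameter bounds, and the population settings are all assumptions chosen for illustration. In particular, the fitness used here is a toy energy gap between a chain-like target graph and a few decoy graphs, standing in for the fibril fraction, which would have to be measured by simulating each candidate model.

```python
import random
from itertools import combinations


def graph_statistics(n_nodes, edges):
    """Toy topological statistics: (edge count, 2-star count, triangle count)."""
    adj = {i: set() for i in range(n_nodes)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    two_stars = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    triangles = sum(
        1
        for a, b, c in combinations(range(n_nodes), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )
    return (len(edges), two_stars, triangles)


def hamiltonian(theta, stats):
    """Energy as a weighted sum of graph statistics (assumed linear form)."""
    return sum(t * s for t, s in zip(theta, stats))


# Hypothetical target: a 6-node chain. Decoys: empty, star, and complete graphs.
N = 6
TARGET = graph_statistics(N, [(i, i + 1) for i in range(N - 1)])
DECOYS = [
    graph_statistics(N, []),
    graph_statistics(N, [(0, i) for i in range(1, N)]),
    graph_statistics(N, list(combinations(range(N), 2))),
]


def energy_gap(theta):
    """Toy fitness: how far the target sits below the lowest-energy decoy."""
    return min(hamiltonian(theta, d) for d in DECOYS) - hamiltonian(theta, TARGET)


def evolve(fitness, n_params=3, pop_size=20, generations=40, seed=0):
    """Simple elitist genetic algorithm over coefficient vectors in [-1, 1]."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_params)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            # Uniform crossover followed by a small Gaussian mutation.
            child = [rng.choice(pair) + rng.gauss(0, 0.1) for pair in zip(mom, dad)]
            children.append([max(-1.0, min(1.0, x)) for x in child])
        pop = parents + children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


best_theta, history = evolve(energy_gap)
print("best coefficients:", best_theta)
print("fitness, first vs. last generation:", history[0], history[-1])
```

Because the fitter half of each generation is carried over unchanged, the best fitness in this sketch can never decrease from one generation to the next. Replacing `energy_gap` with a function that simulates a model and returns its fibril fraction would bring the sketch closer to the workflow the abstract describes.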