Learning dynamical information from static protein and sequencing data.
Philip Pearce, Francis G Woodhouse, Aden Forrow, Ashley Kelly, Halim Kusumaatmaja, Jörn Dunkel
Published in: Nature Communications (2019)
Many complex processes, from protein folding to neuronal network dynamics, can be described as stochastic exploration of a high-dimensional energy landscape. Although efficient algorithms for cluster detection in high-dimensional spaces have been developed over the last two decades, considerably less is known about the reliable inference of state transition dynamics in such settings. Here we introduce a flexible and robust numerical framework to infer Markovian transition networks directly from time-independent data sampled from stationary equilibrium distributions. We demonstrate the practical potential of the inference scheme by reconstructing the network dynamics for several protein-folding transitions, gene-regulatory network motifs, and HIV evolution pathways. The predicted network topologies and relative transition time scales agree well with direct estimates from time-dependent molecular dynamics data, stochastic simulations, and phylogenetic trees, respectively. Owing to its generic structure, the framework introduced here will be applicable to high-throughput RNA and protein-sequencing datasets, and future cryo-electron microscopy (cryo-EM) data.
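The abstract does not spell out how transition rates are obtained from stationary samples. As a rough illustration of the general idea only, and not the authors' algorithm, the sketch below assumes that (i) each metastable state's energy can be read off its equilibrium population through a Boltzmann relation, and (ii) rates between states follow a simple Arrhenius form set by the barrier (saddle) energy. The function names, the `kT` parameter, and the example populations and barrier heights are all hypothetical.

```python
import numpy as np


def energies_from_populations(populations, kT=1.0):
    """Assumed Boltzmann relation: E_i = -kT * log(p_i), up to a constant."""
    p = np.asarray(populations, dtype=float)
    p = p / p.sum()  # normalise to a probability vector
    return -kT * np.log(p)


def arrhenius_rate_matrix(state_energies, barrier_energies, kT=1.0):
    """Build a continuous-time Markov rate matrix K (rows sum to zero).

    state_energies   : energy of each metastable state (length n).
    barrier_energies : dict mapping a pair (i, j) to the saddle energy
                       separating states i and j (hypothetical input).
    K[i, j] is the rate of jumping from state i to state j, taken here to be
    exp(-(E_barrier - E_i) / kT) with the prefactor set to 1 (an assumption).
    """
    E = np.asarray(state_energies, dtype=float)
    n = E.size
    K = np.zeros((n, n))
    for (i, j), e_barrier in barrier_energies.items():
        K[i, j] = np.exp(-(e_barrier - E[i]) / kT)
        K[j, i] = np.exp(-(e_barrier - E[j]) / kT)
    # Diagonal entries make each row sum to zero (probability conservation).
    K[np.diag_indices(n)] = -K.sum(axis=1)
    return K


# Illustrative three-state chain 0 <-> 1 <-> 2 with made-up numbers.
populations = np.array([0.6, 0.3, 0.1])
energies = energies_from_populations(populations)
barriers = {(0, 1): 2.0, (1, 2): 3.0}
K = arrhenius_rate_matrix(energies, barriers)
```

Because both directions of each transition share one barrier energy, the resulting network satisfies detailed balance, so the input populations are stationary under `K` (that is, `populations @ K` vanishes up to rounding). Only relative time scales are meaningful here, since the rate prefactor is arbitrary.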
Keyphrases
- molecular dynamics
- single cell
- electronic health record
- high throughput
- electron microscopy
- protein-protein
- big data
- density functional theory
- amino acid
- machine learning
- single molecule
- binding protein
- healthcare
- hepatitis C virus
- human immunodeficiency virus
- data analysis
- HIV infected
- risk assessment
- small molecule
- mass spectrometry
- antiretroviral therapy
- current status
- loop mediated isothermal amplification
- climate change
- South Africa
- nucleic acid
- label free