HyperTraPS-CT: Inference and prediction for accumulation pathways with flexible data and model structures.
Olav N L Aga, Morten Brun, Kazeem A Dauda, Ramon Diaz-Uriarte, Konstantinos Giannakis, Iain G Johnston
Published in: PLoS Computational Biology (2024)
Accumulation processes, where many potentially coupled features are acquired over time, occur throughout the sciences, from evolutionary biology to disease progression, and particularly in the study of cancer progression. Existing methods for learning the dynamics of such systems typically assume limited (often pairwise) relationships between feature subsets, cross-sectional or untimed observations, small feature sets, or discrete orderings of events. Here we introduce HyperTraPS-CT (Hypercubic Transition Path Sampling in Continuous Time) to compute posterior distributions on continuous-time dynamics of many, arbitrarily coupled, traits in unrestricted state spaces, accounting for uncertainty in observations and their timings. We demonstrate the capacity of HyperTraPS-CT to deal with cross-sectional, longitudinal, and phylogenetic data, which may have no, uncertain, or precisely specified sampling times. HyperTraPS-CT allows positive and negative interactions between arbitrary subsets of features (not limited to pairwise interactions), supporting Bayesian and maximum-likelihood inference approaches to identify these interactions, consequent pathways, and predictions of future and unobserved features. We also introduce a range of visualisations for the inferred outputs of these processes and demonstrate model selection and regularisation for feature interactions. We apply this approach to case studies on the accumulation of mutations in cancer progression and the acquisition of anti-microbial resistance genes in tuberculosis, demonstrating its flexibility and capacity to produce predictions aligned with applied priorities.
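The abstract does not spell out the model's parameterisation, so the following is only a minimal Python sketch of what a continuous-time accumulation process on a hypercube can look like. It assumes, purely for illustration, that the rate of acquiring feature i is exp(base[i] + sum of inter[j][i] over features j already present), which restricts the sketch to pairwise interactions (HyperTraPS-CT itself is described as allowing interactions between arbitrary feature subsets). The names `acquisition_rates`, `simulate_path`, `base`, and `inter` are hypothetical and are not taken from the HyperTraPS-CT software.

```python
import math
import random


def acquisition_rates(state, base, inter):
    """Return {feature index: rate} for every feature not yet acquired.

    Assumed (illustrative) form: log-rate of feature i is base[i] plus
    inter[j][i] for each feature j already present in `state`.
    """
    rates = {}
    for i, present in enumerate(state):
        if present:
            continue  # features are acquired irreversibly
        log_rate = base[i] + sum(
            inter[j][i] for j, has_j in enumerate(state) if has_j
        )
        rates[i] = math.exp(log_rate)
    return rates


def simulate_path(base, inter, rng, t_max=math.inf):
    """Simulate one trajectory from the all-zeros state.

    Each step draws an exponential waiting time from the total rate, then
    picks the next feature with probability proportional to its rate.
    Returns a list of (time, state) pairs, stopping at t_max or when
    every feature has been acquired.
    """
    state = [0] * len(base)
    t = 0.0
    path = [(t, tuple(state))]
    while not all(state):
        rates = acquisition_rates(state, base, inter)
        t += rng.expovariate(sum(rates.values()))
        if t > t_max:
            break
        features = list(rates)
        chosen = rng.choices(features, weights=[rates[i] for i in features])[0]
        state[chosen] = 1
        path.append((t, tuple(state)))
    return path


# Three features; having feature 0 promotes feature 1 and suppresses feature 2.
base = [0.0, 0.0, 0.0]
inter = [[0.0, 2.0, -2.0],
         [0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]
path = simulate_path(base, inter, random.Random(1))
```

Each simulated path is one route from the empty state towards the all-ones corner of the hypercube. A cross-sectional observation corresponds to a single state on such a path, and a longitudinal observation to two or more states on the same path, with or without known times.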
Keyphrases
- cross sectional
- image quality
- dual energy
- computed tomography
- contrast enhanced
- papillary thyroid
- machine learning
- genome wide
- deep learning
- positron emission tomography
- magnetic resonance imaging
- squamous cell
- big data
- peripheral blood
- electronic health record
- mycobacterium tuberculosis
- high resolution
- magnetic resonance
- emergency department
- microbial community
- lymph node metastasis
- current status
- hepatitis c virus
- transcription factor
- neural network
- adverse drug
- genome wide identification