Neural network and kinetic modelling of human genome replication reveal replication origin locations and strengths.

Jean-Michel Arbona, Hadi Kabalane, Jeremy Barbier, Arach Goldar, Olivier Hyrien, Benjamin Audit
Published in: PLoS computational biology (2023)
In humans and other metazoans, the determinants of replication origin location and strength are still elusive. Origins are licensed in G1 phase and fired in S phase of the cell cycle. It is debated which of these two temporally separate steps determines origin efficiency. Experiments can independently profile mean replication timing (MRT) and replication fork directionality (RFD) genome-wide. Such profiles contain information on multiple origins' properties and on fork speed. Because an origin can be inactivated by passive replication, however, observed and intrinsic origin efficiencies can differ markedly. Thus, there is a need for methods to infer intrinsic origin efficiency from observed origin efficiency, which is context-dependent. Here, we show that MRT and RFD data are highly consistent with each other but contain information at different spatial scales. Using neural networks, we infer an origin licensing landscape that, when inserted in an appropriate simulation framework, jointly predicts MRT and RFD data with unprecedented precision and underlines the importance of dispersive origin firing. We furthermore uncover an analytical formula that predicts intrinsic origin efficiency from observed origin efficiency combined with MRT data. Comparison of inferred intrinsic origin efficiencies with experimental profiles of licensed origins (ORC, MCM) and actual initiation events (Bubble-seq, SNS-seq, OK-seq, ORM) shows that intrinsic origin efficiency is not solely determined by licensing efficiency. Thus, human replication origin efficiency is set at both the origin licensing and firing steps.
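The gap between intrinsic and observed origin efficiency caused by passive replication can be illustrated with a toy model. The sketch below is not the simulation framework or the neural network described in the abstract. It assumes a one-dimensional genome, a constant fork speed `v`, and origins whose firing times are drawn from an exponential distribution with a per-origin rate (used here as a stand-in for intrinsic strength). All function names and parameter values are hypothetical.

```python
import random


def replication_time(x, origins, fire_times, v):
    """Time at which position x is replicated: earliest fork arrival from any origin."""
    return min(t + abs(x - o) / v for o, t in zip(origins, fire_times))


def fires_actively(i, origins, fire_times, v):
    """True if origin i fires before a fork from any other origin reaches it."""
    arrivals = [
        t + abs(origins[i] - o) / v
        for j, (o, t) in enumerate(zip(origins, fire_times))
        if j != i
    ]
    return not arrivals or fire_times[i] < min(arrivals)


def observed_efficiency(origins, rates, v, n_cells=2000, seed=0):
    """Fraction of simulated cells in which each origin fires actively.

    Assumption: firing times are exponential with the given rates.
    """
    rng = random.Random(seed)
    counts = [0] * len(origins)
    for _ in range(n_cells):
        fire_times = [rng.expovariate(r) for r in rates]
        for i in range(len(origins)):
            counts[i] += fires_actively(i, origins, fire_times, v)
    return [c / n_cells for c in counts]


if __name__ == "__main__":
    # Two origins 100 kb apart, fork speed 1 kb/min (illustrative values only).
    eff = observed_efficiency([0.0, 100.0], rates=[0.1, 0.001], v=1.0)
    print(eff)
```

In this setting the weak origin fires in only a small fraction of simulated cells, largely because forks from the strong neighbour usually replicate it first. Moving the same weak origin far from any strong origin would raise its observed efficiency without changing its assumed rate, which is the sense in which observed efficiency is context-dependent.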
Keyphrases
  • genome wide
  • neural network
  • cell cycle
  • endothelial cells
  • single cell
  • mass spectrometry
  • electronic health record
  • rna seq
  • healthcare
  • copy number
  • induced pluripotent stem cells
  • social media
  • data analysis