A mutation-level covariate model for mutational signatures.

Itay Kahane, Mark D M Leiserson, Roded Sharan
Published in: PLoS computational biology (2023)
Mutational processes and their exposures in particular genomes are key to our understanding of how these genomes are shaped. However, current analyses assume that these processes are uniformly active across the genome without accounting for potential covariates such as strand or genomic region that could impact such activities. Here we suggest the first mutation-covariate models that explicitly model the effect of different covariates on the exposures of mutational processes. We apply these models to test the impact of replication strand on these processes and compare them to strand-oblivious models across a range of data sets. Our models capture replication strand specificity, point to signatures affected by it, and score better on held-out data compared to standard models that do not account for mutation-level covariate information.
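To make the idea of covariate-dependent exposures concrete, the following toy sketch compares the log-likelihood of a handful of mutations under exposures shared across strands and under exposures that differ by replication strand. It is an illustration only and not the authors' model: the function name `log_likelihood`, the two signatures over three mutation categories, the exposure values, and the mutation data are all made-up assumptions, and each mutation is simply scored as a mixture of signatures weighted by the exposures for its covariate value.

```python
import numpy as np


def log_likelihood(categories, covariates, signatures, exposures):
    """Log-likelihood of mutations under covariate-specific exposures.

    Hypothetical helper for illustration.
    categories: (N,) mutation category index of each mutation
    covariates: (N,) covariate value of each mutation (e.g. strand)
    signatures: (K, M) array, row k is signature k's category distribution
    exposures:  (C, K) array, row c holds the exposures for covariate value c
    """
    # Row c is the category distribution for covariate value c:
    # sum_k exposures[c, k] * signatures[k, :]
    category_probs = exposures @ signatures  # shape (C, M)
    # Look up each mutation's probability and sum the logs.
    return float(np.sum(np.log(category_probs[covariates, categories])))


# Toy signatures (assumed values): 2 signatures over 3 mutation categories.
signatures = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.3, 0.6]])

# Toy mutations (assumed values): category and strand (0 = leading, 1 = lagging).
categories = np.array([0, 0, 2, 2, 1, 0])
covariates = np.array([0, 0, 1, 1, 0, 1])

# Strand-oblivious: the same exposures are used for both strands.
oblivious = np.array([[0.5, 0.5],
                      [0.5, 0.5]])
# Strand-aware: each strand has its own exposures.
strand_aware = np.array([[0.9, 0.1],
                         [0.2, 0.8]])

ll_oblivious = log_likelihood(categories, covariates, signatures, oblivious)
ll_aware = log_likelihood(categories, covariates, signatures, strand_aware)
print(f"strand-oblivious: {ll_oblivious:.3f}")
print(f"strand-aware:     {ll_aware:.3f}")
```

On this toy data the strand-aware exposures give the higher log-likelihood, because the mutations on each strand were chosen to favor a different signature. Comparing models on held-out data, as the abstract describes, would apply the same scoring to mutations that were not used for fitting.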
Keyphrases
  • genome wide
  • air pollution
  • big data
  • healthcare
  • machine learning
  • gene expression
  • copy number
  • health information