Context dependency of nucleotide probabilities and variants in human DNA.
Yuhu Liang, Christian Grønbæk, Piero Fariselli, Anders Krogh. Published in: BMC Genomics (2022)
Our study found strong context dependencies of nucleotides in the human genome. The best model uses a context of 14 nucleotides on each side. Based on these models, a substitution model was constructed that separates into two parts: the context model and a matrix that depends on a small context. The model fits somatic variants particularly well.
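
To make the idea of a context-dependent nucleotide probability concrete, here is a minimal sketch of estimating the probability of a central nucleotide given k flanking nucleotides on each side. It is a plain count-based estimator with a pseudocount, not the model used in the study; the function names `context_counts` and `central_probabilities`, the pseudocount, and the toy sequence are all assumptions made for illustration. A context of 14 nucleotides per side, as reported above, would be far too sparse for raw counts like these, so the example uses k = 1.

```python
from collections import Counter, defaultdict


def context_counts(seq, k):
    """Count the central nucleotide for every (left, right) context of k bases per side.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    counts = defaultdict(Counter)
    for i in range(k, len(seq) - k):
        context = (seq[i - k:i], seq[i + 1:i + k + 1])
        counts[context][seq[i]] += 1
    return counts


def central_probabilities(counts, context, alphabet="ACGT", pseudocount=1.0):
    """Estimate P(central base | context) from counts with additive smoothing.

    The pseudocount is an assumption; an unseen context yields a uniform distribution.
    """
    observed = counts.get(context, Counter())
    total = sum(observed[b] for b in alphabet) + pseudocount * len(alphabet)
    return {b: (observed[b] + pseudocount) / total for b in alphabet}


# Toy usage with k = 1 on a made-up sequence.
toy_counts = context_counts("ACGACGACT", k=1)
toy_probs = central_probabilities(toy_counts, ("A", "G"))
```

In this toy sequence the context A_G is observed twice, both times with a central C, so the smoothed estimate favours C while still leaving some probability for the other three bases.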