Login / Signup

Resolving complex structures at oncovirus integration loci with conjugate graph.

Wenlong JiaChang XuShuai Cheng Li
Published in: Briefings in bioinformatics (2022)
Oncovirus integrations cause copy number variations and complex structural variations (SVs) on host genomes. However, the understanding of how inserted viral DNA impacts the local genome remains limited. The linear structure of the oncovirus integrated local genomic map (LGM) will lay the foundations to understand how oncovirus integrations emerge and compromise the host genome's functioning. We propose a conjugate graph model to reconstruct the rearranged LGM at integrated loci. Simulation tests prove the reliability and credibility of the algorithm. Applications of the algorithm to whole-genome sequencing data of human papillomavirus (HPV) and hepatitis B virus (HBV)-infected cancer samples gained biological insights on oncovirus integrations. We observed four affection patterns of oncovirus integrations from the HPV and HBV-integrated cancer samples, including the coding-frame truncation, hyper-amplification of tumor gene, the viral cis-regulation inserted at the single intron and at the intergenic region. We found that the focal duplicates and host SVs are frequent in the HPV-integrated LGMs, while the focal deletions are prevalent in HBV-integrated LGMs. Furthermore, with the results yields from our method, we found the enhanced microhomology-mediated end joining might lead to both HPV and HBV integrations and conjectured that the HPV integrations might mainly occur during the DNA replication process. The conjugate graph algorithm code and LGM construction pipeline, available at https://github.com/deepomicslab/FuseSV.
Keyphrases