Meta-analysis of Epstein-Barr virus genomes in Southern Chinese identifies genetic variants and high risk viral lineage associated with nasopharyngeal carcinoma.
Ka Wo Wong, Kwai Fung Hui, Ki Pui Lam, Dora Lai-Wan Kwong, Maria Li Lung, Wanling Yang, Alan Kwok Shing Chiang
Published in: PLoS Pathogens (2024)
Genetic variants in Epstein-Barr virus (EBV) have been strongly associated with nasopharyngeal carcinoma (NPC) in South China. However, separate studies have reported different results for the most significant viral variants, implicating polymorphisms in the EBER2 and BALF2 loci. In this study, we newly sequenced 100 EBV genomes derived from 61 NPC cases and 39 population controls. We then conducted comprehensive genomic analyses of EBV sequences from NPC patients and healthy carriers in South China, totaling 279 cases and 227 controls. A meta-analysis of genome-wide association studies revealed a 4-bp deletion downstream of EBER2 (coordinates 7188-7191; EBER-del) as the variant most significantly associated with NPC. Furthermore, multiple viral variants were found to be genetically linked to EBER-del, forming a risk haplotype, which suggests that multiple viral variants might be associated with NPC pathogenesis. Population structure and phylogenetic analyses further characterized a high-risk EBV lineage for NPC, revealing a panel of 38 single nucleotide polymorphisms (SNPs), including those in the EBER2 and BALF2 loci. Using linkage disequilibrium clumping and a feature selection algorithm, the 38 SNPs could be narrowed down to 9 SNPs that can be used to accurately detect the high-risk EBV lineage. In summary, our study provides novel insight into the role of EBV genetic variation in NPC pathogenesis by defining a risk haplotype of EBV for downstream functional studies and by identifying a single high-risk EBV lineage, characterized by 9 SNPs, for potential application in population screening of NPC.
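The abstract does not state which meta-analysis model was used. As a rough illustration of how per-cohort association results for a single variant can be pooled, the sketch below applies a standard fixed-effect, inverse-variance weighted meta-analysis of log odds ratios. The function names, the continuity correction, and the carrier counts are all assumptions made for illustration; the counts are invented and are not data from the study.

```python
import math


def log_odds_ratio(case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers):
    """Return (log odds ratio, standard error) for one 2x2 table.

    A 0.5 continuity correction is added to every cell (an assumption of
    this sketch) so that a zero count does not cause a division by zero.
    """
    a, b, c, d = (
        x + 0.5
        for x in (case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers)
    )
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def fixed_effect_meta(estimates):
    """Pool (log OR, SE) pairs with inverse-variance weights.

    Returns (pooled log OR, pooled SE, z score, two-sided p-value).
    """
    weights = [1 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return pooled, pooled_se, z, p_value


# Hypothetical carrier counts for one variant in three cohorts:
# (case carriers, case non-carriers, control carriers, control non-carriers).
# These numbers are made up for the example.
studies = [
    (50, 11, 15, 24),
    (80, 20, 40, 60),
    (90, 28, 35, 53),
]

per_study = [log_odds_ratio(*counts) for counts in studies]
pooled, pooled_se, z, p_value = fixed_effect_meta(per_study)

for (beta, se), counts in zip(per_study, studies):
    print(f"cohort {counts}: OR = {math.exp(beta):.2f} (SE of log OR = {se:.2f})")
print(f"pooled OR = {math.exp(pooled):.2f}, z = {z:.2f}, p = {p_value:.2e}")
```

In a genome-wide setting the same pooling would be repeated for every variant, and the resulting p-values compared against a multiple-testing threshold. A random-effects model would be the usual alternative if the cohorts were expected to differ in true effect size.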
Keyphrases
- epstein barr virus
- genome wide
- diffuse large b cell lymphoma
- copy number
- genome wide association study
- sars cov
- single cell
- systematic review
- genome wide association
- end stage renal disease
- machine learning
- chronic kidney disease
- ejection fraction
- case control
- deep learning
- risk assessment
- newly diagnosed
- hiv infected