Analysis of nucleotide variations in human G-quadruplex forming regions associated with disease states.
Aryan Neupane, Julia H Chariker, Eric C Rouchka
Published in: bioRxiv: the preprint server for biology (2023)
While roles for G-quadruplex (G4) structures have been identified in cancers and metabolic disorders, single nucleotide variations (SNVs) and their effect on G4s in disease contexts have not been extensively studied. The COSMIC and CLINVAR databases were used to detect SNVs present in G4s, to identify sequence-level changes, and to assess their effect on G4 secondary structure. In total, 37,515 G4 SNVs were identified in the COSMIC database and 2,115 in CLINVAR. Of those, 7,236 (19.3%) of the COSMIC variants and 416 (18%) of the CLINVAR variants result in G4 loss, while 2,728 (COSMIC) and 112 (CLINVAR) SNVs result in the gain of a G4 structure. The gene ontology term "GnRH (Gonadotropin-releasing hormone) secretion" is enriched, with 21 genes in this pathway carrying a G4-destabilizing SNV. Analysis of mutational patterns in the G4 structure shows a 3-fold higher selective pressure in the coding region on the template strand compared to the non-template strand. At the same time, an equal proportion of SNVs was observed among intronic, promoter and enhancer regions across strands. In GO and pathway enrichment analysis, genes with SNVs affecting G4-forming propensity in the coding region are enriched for regulation of Ras protein signal transduction and Src homology 3 (SH3) domain binding.
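The G4 loss and gain categories described above can be illustrated with a minimal Python sketch. It assumes the widely used canonical motif (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) as a stand-in predictor. The abstract does not state which G4 prediction tool or thresholds the authors used, so the pattern, the function names `has_g4` and `classify_snv`, and the example sequences are all hypothetical. The sketch scans only the G-rich strand as given and does not check the reverse complement.

```python
import re

# Assumed canonical G4 motif: G{3+} N{1-7} G{3+} N{1-7} G{3+} N{1-7} G{3+}.
# This is an illustrative definition, not necessarily the predictor used in the study.
G4_MOTIF = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def has_g4(seq: str) -> bool:
    """Return True if the sequence contains at least one canonical G4 motif."""
    return G4_MOTIF.search(seq.upper()) is not None


def classify_snv(ref_seq: str, pos: int, alt: str) -> str:
    """Classify a single-nucleotide substitution by its effect on the motif.

    pos is a 0-based index into ref_seq; alt is the substituted base.
    Returns "loss", "gain", "retained", or "none".
    """
    ref_seq = ref_seq.upper()
    alt_seq = ref_seq[:pos] + alt.upper() + ref_seq[pos + 1:]
    before, after = has_g4(ref_seq), has_g4(alt_seq)
    if before and not after:
        return "loss"
    if after and not before:
        return "gain"
    return "retained" if before else "none"


# Hypothetical example sequences (not taken from COSMIC or CLINVAR).
loss_example = classify_snv("GGGAGGGTGGGAGGG", 1, "A")      # breaks the first G-run
gain_example = classify_snv("GGGAGGGTGAGAGGG", 9, "G")      # completes the third G-run
retained_example = classify_snv("GGGAGGGTGGGAGGG", 3, "T")  # changes a loop base only
```

A motif match is a coarse proxy: it reports whether the sequence pattern is present, not how stable the folded structure would be, so score-based predictors can classify the same variant differently.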
Keyphrases
- genome wide
- genome wide identification
- copy number
- transcription factor
- dna methylation
- endothelial cells
- binding protein
- gene expression
- multidrug resistant
- emergency department
- high resolution
- preterm infants
- tyrosine kinase
- molecularly imprinted
- deep learning
- young adults
- artificial intelligence
- small molecule
- mass spectrometry
- gestational age
- dna binding
- childhood cancer