A scale-free analysis of the HIV-1 genome demonstrates multiple conserved regions of structural and functional importance.
Jordan Peter Skittrall, Carin K Ingemarsdotter, Julia R Gog, Andrew M L Lever
Published in: PLoS Computational Biology (2019)
HIV-1 replicates via a low-fidelity polymerase with a high mutation rate; strong conservation of individual nucleotides is therefore highly indicative of critical structural or functional properties. Identifying such conservation can reveal novel insights into viral behaviour. We analysed 3651 publicly available sequences for the presence of nucleic acid conservation beyond that required by amino acid constraints, using a novel scale-free method that identifies regions of outlying score, together with a codon scoring algorithm. Sequences with outlying score were further analysed using an algorithm for producing local RNA folds whilst accounting for alignment properties. Eleven different conserved regions were identified, some corresponding to well-known cis-acting functions of the HIV-1 genome, but also others whose conservation has not previously been noted. We identify rational causes for many of these, including cis functions, possible additional reading frame usage, a plausible mechanism by which the central polypurine tract primes second-strand DNA synthesis, and a conformational stabilising function of a region at the 5' end of env.
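To make the idea of "conservation beyond that required by amino acid constraints" concrete, the following is a minimal Python sketch. It is not the codon scoring algorithm used in the paper, whose details are not given in this abstract. It assumes an in-frame, gap-aligned set of coding sequences and uses a deliberately simple, hypothetical score: for each codon column, the fraction of sequences encoding the consensus amino acid that also share the single most common codon for it. The function name `synonymous_conservation` and the scoring rule are illustrative assumptions, and the scale-free detection of outlying regions is not attempted here.

```python
from collections import Counter

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def synonymous_conservation(aligned_seqs):
    """Score each codon column of an in-frame DNA alignment (hypothetical rule).

    Returns a list with one entry per codon column: a float in (0, 1], or
    None when the column is uninformative (no valid codons, or a consensus
    amino acid that has only one codon, such as Met or Trp).
    """
    length = min(len(s) for s in aligned_seqs)
    scores = []
    for i in range(0, length - length % 3, 3):
        # Keep only unambiguous, ungapped codons in this column.
        codons = [s[i:i + 3].upper() for s in aligned_seqs]
        codons = [c for c in codons if c in CODON_TABLE]
        if not codons:
            scores.append(None)
            continue
        # Consensus amino acid at this position.
        consensus_aa = Counter(CODON_TABLE[c] for c in codons).most_common(1)[0][0]
        n_synonyms = sum(1 for aa in CODON_TABLE.values() if aa == consensus_aa)
        if n_synonyms == 1:
            # No synonymous alternatives exist, so nothing can be inferred.
            scores.append(None)
            continue
        # Among sequences encoding the consensus amino acid, how dominant
        # is the most common codon?
        synonymous = [c for c in codons if CODON_TABLE[c] == consensus_aa]
        top_count = Counter(synonymous).most_common(1)[0][1]
        scores.append(top_count / len(synonymous))
    return scores


# Toy alignment (made-up sequences, for illustration only).
toy = ["ATGGCTAAA", "ATGGCTAAG", "ATGGCTAAA", "ATGGCCAAG"]
scores = synonymous_conservation(toy)
```

In this sketch, a score near 1 at an amino acid with several synonymous codons would hint that the nucleotide sequence itself is constrained, which is the kind of signal the abstract describes. A real analysis would also need to account for background codon usage and mutation biases, which this toy score ignores.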
Keyphrases
- nucleic acid
- antiretroviral therapy
- hiv positive
- hiv infected
- hiv testing
- human immunodeficiency virus
- hepatitis c virus
- genome wide
- hiv aids
- men who have sex with men
- machine learning
- amino acid
- deep learning
- transcription factor
- single molecule
- dna methylation
- gene expression
- circulating tumor
- molecular dynamics simulations
- cell free