Structure-based deep learning for binding site detection in nucleic acid macromolecules.
Igor Kozlovskii, Petr A Popov
Published in: NAR Genomics and Bioinformatics (2021)
Structure-based drug design (SBDD) targeting nucleic acid macromolecules, particularly RNA, is a research direction that is gaining momentum and has already resulted in several FDA-approved compounds. As with proteins, one of the critical components of SBDD for RNA is the correct identification of binding sites for putative drug candidates. RNAs share a common structural organization that, together with the dynamic nature of these molecules, makes it challenging to recognize binding sites for small molecules. Moreover, there is a need for structure-based approaches, as sequence information alone does not account for the conformational plasticity of nucleic acid macromolecules. Deep learning holds great promise for solving the binding site detection problem, but it requires a large amount of structural data, which is very limited for nucleic acids compared to proteins. In this study, we compiled a set of ∼2000 nucleic acid-small molecule structures comprising ∼2500 binding sites, which is ∼40 times larger than the previously used one, and demonstrated the first structure-based deep learning approach, BiteNet N, to detect binding sites in nucleic acid structures. BiteNet N operates on arbitrary nucleic acid complexes, shows state-of-the-art performance, and can be helpful in the analysis of different conformations and mutant variants, as we demonstrated for the HIV-1 TAR RNA and ATP-aptamer case studies.
Keyphrases
- nucleic acid
- deep learning
- small molecule
- artificial intelligence
- high resolution
- machine learning
- convolutional neural network
- hepatitis c virus
- big data
- emergency department
- gold nanoparticles
- mass spectrometry
- dna methylation
- electronic health record
- copy number
- gene expression
- adverse drug
- hiv testing
- south africa
- amino acid
- crystal structure