SWAAT Bioinformatics Workflow for Protein Structure-Based Annotation of ADME Gene Variants.
Houcemeddine Othman, Sherlyn Jemimah, Jorge Emanuel Batista da Rocha
Published in: Journal of Personalized Medicine (2022)
Recent genomic studies have revealed the critical impact of genetic diversity within small population groups in determining how individuals respond to drugs. One of the biggest challenges is to accurately predict the effect of single nucleotide variants and to obtain the information needed for a better functional interpretation of genetic data. Changes in the amino acid sequences of pharmacologically important proteins can give rise to different conformational scenarios that might affect their stability and plasticity, which in turn might alter the interaction with the drug. Current sequence-based annotation methods have limited power to access this type of information. Motivated by this need, we have developed the Structural Workflow for Annotating ADME Targets (SWAAT), which predicts variant effects based on structural properties. SWAAT annotates a panel of 36 ADME genes, including 22 of the 23 clinically important members identified by the PharmVar consortium. The workflow consists of a set of Python scripts whose execution is managed by Nextflow, and it annotates coding variants based on 37 criteria. SWAAT also includes an auxiliary workflow that extends its use to genes other than ADME members. The tool further includes a random forest binary classifier that showed an accuracy of 73%. Moreover, SWAAT outperformed six commonly used sequence-based variant prediction tools (PROVEAN, SIFT, PolyPhen-2, CADD, MetaSVM, and FATHMM) in terms of sensitivity, with comparable specificity. SWAAT is available as an open-source tool.
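The abstract compares classifiers by accuracy, sensitivity, and specificity. As a reminder of how these three figures relate for a binary predictor, here is a minimal sketch in plain Python. The function name `binary_metrics` and the toy labels are assumptions made for illustration only; they are not SWAAT code, SWAAT data, or SWAAT results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity for 0/1 labels.

    Hypothetical helper: 1 stands for a deleterious variant, 0 for a neutral one.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: int, den: int) -> float:
        # Return NaN when a class is absent instead of dividing by zero.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # fraction of deleterious variants found
        "specificity": ratio(tn, tn + fp),  # fraction of neutral variants kept neutral
    }


# Toy labels invented for this example (not taken from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these made-up labels the predictor has higher sensitivity than specificity, which illustrates why a tool can lead on one measure while only matching others on another.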
Keyphrases
- copy number
- genome wide
- molecular docking
- electronic health record
- amino acid
- genetic diversity
- machine learning
- climate change
- molecular dynamics simulations
- dna methylation
- genome wide identification
- adverse drug
- genome wide analysis
- healthcare
- molecular dynamics
- sensitive detection
- drug induced
- small molecule
- single cell
- single molecule
- protein protein
- neural network