Identification of Novel Genes and Proteoforms in Angiostrongylus costaricensis through a Proteogenomic Approach.
Esdras Matheus Gomes da Silva, Karina Mastropasqua Rebello, Young-Jun Choi, Vitor Gregorio, Alexandre Rossi Paschoal, Makedonka Mitreva, James H McKerrow, Ana Gisele da Costa Neves-Ferreira, Fabio Passetti
Published in: Pathogens (Basel, Switzerland) (2022)
RNA sequencing (RNA-Seq) and mass-spectrometry-based proteomics data are often integrated in proteogenomic studies to assist in the prediction of eukaryotic genome features, such as genes, splicing, single-nucleotide variants (SNVs), and single-amino-acid variants (SAAVs). Most genomes of parasitic nematodes are draft versions that lack transcript- and protein-level information and whose gene annotations rely only on computational predictions. Angiostrongylus costaricensis is a roundworm species that causes an intestinal inflammatory disease known as abdominal angiostrongyliasis (AA). Currently, there is no drug available that acts directly on this parasite, mostly due to the sparse understanding of its molecular characteristics. The available genome of A. costaricensis, specific to the Costa Rica strain, is a draft version that is not supported by transcript- or protein-level evidence. This study used RNA-Seq and MS/MS data to perform an in-depth annotation of the A. costaricensis genome. Our prediction improved the reference annotation with (a) novel coding and non-coding genes; (b) evidence of alternative splicing generating new proteoforms; and (c) a list of SNVs between the Brazilian (Crissiumal) and Costa Rica strains. To the best of our knowledge, this is the first time that a multi-omics approach has been used to improve the genome annotation of A. costaricensis. We hope this improved genome annotation can assist in the future development of drugs, kits, and vaccines to treat, diagnose, and prevent AA caused by either the Brazilian (Crissiumal) strain or the Costa Rica strain.
Keyphrases
- rna seq
- single cell
- genome wide
- mass spectrometry
- amino acid
- dna methylation
- copy number
- genome wide identification
- ms ms
- bioinformatics analysis
- healthcare
- electronic health record
- liquid chromatography
- genome wide analysis
- big data
- toxoplasma gondii
- high resolution
- machine learning
- transcription factor
- capillary electrophoresis
- drug induced
- plasmodium falciparum
- small molecule
- gas chromatography
- liquid chromatography tandem mass spectrometry
- optical coherence tomography