Application of beta and gamma carbonic anhydrase sequences as tools for identification of bacterial contamination in the whole genome sequence of inbred Wuzhishan minipig (Sus scrofa) annotated in databases.
Reza Zolfaghari Emameh, Seyed Nezamedin Hosseini, Seppo Parkkila. Published in: Database: the journal of biological databases and curation (2021)
Sus scrofa, the pig, was domesticated thousands of years ago. Various indigenous breeds gave rise to different phenotypes, such as the Chinese inbred miniature pig, or Wuzhishan pig (WZSP), which is widely used in the life and medical sciences. The whole genome of WZSP was sequenced in 2012. In a bioinformatics study of pig carbonic anhydrase (CA) sequences, we detected some β- and γ-class CAs among the WZSP CAs annotated in databases, although β- and γ-CAs had not previously been described in vertebrates. This finding prompted us to examine the whole genome sequence of WZSP for possible bacterial contamination. In this study, we used bioinformatics methods and web tools such as UniProt, European Bioinformatics Institute, National Center for Biotechnology Information, Ensembl Genome Browser, Ensembl Bacteria, RCSB PDB and Pseudomonas Genome Database. Our analysis showed that the pig has 12 classical α-CAs and 3 CA-related proteins. It also confirmed that the CAs detected in WZSP fall into the β- and γ-CA families and belong to Pseudomonas spp. and Acinetobacter spp. The protein structure study revealed that the identified β-CA sequence from WZSP belongs to Pseudomonas aeruginosa (PDB ID: 5JJ8), and that the identified γ-CA sequence from WZSP also belongs to P. aeruginosa (PDB ID: 3PMO). Bioinformatics and computational methods, combined with bacterial-specific markers such as 16S rRNA and β- and γ-class CA sequences, can be used to identify bacterial contamination in mammalian DNA samples.
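The screening idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes each annotated CA sequence has already been assigned a class ("alpha", "beta" or "gamma") by some upstream annotation step, and the record type, function name and accessions below are hypothetical placeholders. The only rule taken from the abstract is that β- and γ-class CAs are not expected in a vertebrate genome.

```python
from dataclasses import dataclass

# Per the abstract, only alpha-class CAs (and CA-related proteins) are
# expected in vertebrates; beta and gamma classes suggest a bacterial origin.
EXPECTED_VERTEBRATE_CLASSES = {"alpha"}


@dataclass(frozen=True)
class CARecord:
    accession: str  # hypothetical identifier, not a real database accession
    ca_class: str   # "alpha", "beta" or "gamma", assumed assigned upstream


def flag_suspect_records(records):
    """Return the records whose CA class is not expected in a vertebrate host."""
    return [
        r for r in records
        if r.ca_class.lower() not in EXPECTED_VERTEBRATE_CLASSES
    ]


# Made-up example annotations for a vertebrate genome assembly.
records = [
    CARecord("EXAMPLE_0001", "alpha"),
    CARecord("EXAMPLE_0002", "beta"),
    CARecord("EXAMPLE_0003", "gamma"),
]

for record in flag_suspect_records(records):
    print(f"{record.accession}: {record.ca_class}-class CA, possible contaminant")
```

A flagged record is only a candidate: as in the study, it would still need to be compared against bacterial sequence and structure databases before being attributed to a contaminating species.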
Keyphrases
- crispr cas
- genome editing
- pseudomonas aeruginosa
- risk assessment
- protein kinase
- healthcare
- cystic fibrosis
- escherichia coli
- biofilm formation
- climate change
- genome wide
- quality improvement
- emergency department
- dna methylation
- gene expression
- multidrug resistant
- amino acid
- health information
- cell free
- circulating tumor
- deep learning
- protein protein