PhaGenus: genus-level classification of bacteriophages using a Transformer model.

Jiaojiao Guan, Cheng Peng, Jiayu Shang, Xubo Tang, Yanni Sun
Published in: Briefings in Bioinformatics (2023)
In this work, we develop a learning-based model named PhaGenus, which conducts genus-level taxonomic classification of phage contigs. PhaGenus uses a Transformer model to learn the associations between protein clusters and supports classification into up to 508 genera. We tested PhaGenus on four datasets in different scenarios. The experimental results show that PhaGenus outperforms state-of-the-art methods on low-similarity datasets, achieving an improvement of at least 13.7%. Additionally, PhaGenus is highly effective at identifying previously uncharacterized genera that are not represented in reference databases, with an improvement of 8.52%. Analysis of the infant gut and GOV2.0 datasets demonstrates that PhaGenus can classify more contigs with higher accuracy.
Keyphrases
  • machine learning
  • deep learning
  • climate change
  • small molecule
  • artificial intelligence