Classifying the Unclassified: A Phage Classification Method.

Cynthia Maria Chibani, Anton Farr, Sandra Klama, Sascha Dietrich, Heiko Liesegang
Published in: Viruses (2019)
This work reports ClassiPhage, a method to classify phage genomes using sequence-derived taxonomic features. ClassiPhage uses a set of phage-specific Hidden Markov Models (HMMs) generated from clusters of related proteins. The method was validated on all publicly available genomes of phages that are known to infect Vibrionaceae. The phages belong to the well-described phage families Myoviridae, Podoviridae, Siphoviridae, and Inoviridae. The achieved classification is consistent with the assignments of the International Committee on Taxonomy of Viruses (ICTV): all tested phages were assigned to the corresponding group in the ICTV database. In addition, 44 out of 58 genomes of Vibrio phages not yet classified could be assigned to a phage family. The remaining 14 genomes may represent phages of new families or subfamilies. Comparative genomics indicates that the ability of the approach to identify and classify phages is correlated with conserved genomic organization. ClassiPhage classifies phages exclusively based on genome sequence data and can be applied to distinct phage genomes as well as to prophage regions within host genomes. Possible applications include (a) classifying phages from assembled metagenomes; and (b) the identification and classification of integrated prophages and the splitting of phage families into subfamilies.
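The general idea of assigning a genome to a family from its matches against family-specific HMMs can be illustrated with a minimal sketch. This is not the ClassiPhage implementation: the HMM identifiers, the `HMM_FAMILY` table, the `assign_family` function, and the `min_hits` threshold are all hypothetical, and the simple "most distinct matching HMMs wins" rule is an assumption made for illustration only. The sketch takes a list of HMM identifiers that matched proteins of one genome (as might be parsed from an HMM search report) and returns a family name, or `None` when the evidence is missing or ambiguous.

```python
from collections import Counter

# Hypothetical mapping from HMM identifier to the phage family whose
# protein cluster the model was built from (illustrative names only).
HMM_FAMILY = {
    "hmm_myo_01": "Myoviridae",
    "hmm_myo_02": "Myoviridae",
    "hmm_podo_01": "Podoviridae",
    "hmm_sipho_01": "Siphoviridae",
    "hmm_ino_01": "Inoviridae",
}


def assign_family(hits, hmm_family=HMM_FAMILY, min_hits=2):
    """Return the family with the most distinct matching HMMs, or None.

    hits: iterable of HMM identifiers that matched proteins of one genome.
    min_hits: assumed minimum number of distinct HMMs needed for a call.
    """
    # Count each HMM once per genome; ignore identifiers not in the table.
    counts = Counter(hmm_family[h] for h in set(hits) if h in hmm_family)
    if not counts:
        return None  # no family-specific HMM matched
    ranked = counts.most_common(2)
    best_family, best_count = ranked[0]
    if best_count < min_hits:
        return None  # too little evidence for a call
    if len(ranked) > 1 and ranked[1][1] == best_count:
        return None  # tie between two families: leave unclassified
    return best_family


if __name__ == "__main__":
    print(assign_family(["hmm_myo_01", "hmm_myo_02", "hmm_podo_01"]))
    print(assign_family(["hmm_ino_01"]))
```

Under these assumptions, a genome that stays at `None` corresponds to the "not assignable" outcome described in the abstract, where the remaining genomes may represent new families or subfamilies.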
Keyphrases
  • pseudomonas aeruginosa
  • machine learning
  • deep learning
  • cystic fibrosis
  • biofilm formation
  • single cell
  • escherichia coli
  • emergency department
  • artificial intelligence
  • electronic health record
  • genome wide