PlasmidHunter: accurate and fast prediction of plasmid sequences using gene content profile and machine learning.
Ren-Mao Tian, Jizhong Zhou, Behzad Imanian
Published in: Briefings in Bioinformatics (2024)
Plasmids are extrachromosomal DNA found in microorganisms. They often carry beneficial genes that help bacteria adapt to harsh conditions. Plasmids are also important tools in genetic engineering, gene therapy, and drug production. However, it can be difficult to distinguish plasmid sequences from chromosomal sequences in genomic and metagenomic data. Here, we have developed a new tool called PlasmidHunter, which uses machine learning to predict plasmid sequences based on their gene content profile. PlasmidHunter achieves high accuracy (up to 97.6%) and high speed in benchmark tests on both simulated contigs and real metagenomic plasmidome data, outperforming other existing tools.
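The abstract does not specify how the gene content profile is encoded or which classifier is used, so the following is only a toy sketch of the general idea, not PlasmidHunter's implementation. It assumes a binary presence/absence profile over a fixed gene-family vocabulary and swaps in a small Bernoulli naive Bayes classifier. The gene names, the training contigs, and the helper names (`profile`, `train_bernoulli_nb`, `predict`) are all hypothetical.

```python
import math

# Hypothetical gene-family vocabulary (illustrative only).
VOCAB = ["repA", "traI", "parA", "rpoB", "gyrA", "dnaA"]


def profile(gene_hits, vocabulary):
    """Encode a contig as a binary presence/absence vector over the vocabulary."""
    present = set(gene_hits)
    return [1 if family in present else 0 for family in vocabulary]


def train_bernoulli_nb(profiles, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing (alpha)."""
    n_features = len(profiles[0])
    model = {}
    for cls in sorted(set(labels)):
        rows = [p for p, y in zip(profiles, labels) if y == cls]
        log_prior = math.log(len(rows) / len(profiles))
        # Smoothed probability that each gene family is present in this class.
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[cls] = (log_prior, probs)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior score for profile x."""
    def score(cls):
        log_prior, probs = model[cls]
        return log_prior + sum(
            math.log(p) if xi else math.log(1.0 - p) for xi, p in zip(x, probs)
        )
    return max(model, key=score)


# Made-up training contigs: lists of gene families annotated on each contig.
train_contigs = [
    (["repA", "traI"], "plasmid"),
    (["repA", "parA"], "plasmid"),
    (["repA", "traI", "parA"], "plasmid"),
    (["rpoB", "gyrA"], "chromosome"),
    (["dnaA", "rpoB"], "chromosome"),
    (["gyrA", "dnaA", "rpoB"], "chromosome"),
]
X = [profile(genes, VOCAB) for genes, _ in train_contigs]
y = [label for _, label in train_contigs]
model = train_bernoulli_nb(X, y)

print(predict(model, profile(["repA", "traI"], VOCAB)))
print(predict(model, profile(["rpoB", "dnaA"], VOCAB)))
```

In a real workflow the gene hits would come from annotating each contig against a protein or gene-family database, and the vocabulary would be far larger than this six-entry toy list.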
Keyphrases
- Escherichia coli
- copy number
- machine learning
- genome wide
- gene therapy
- big data
- genome wide identification
- CRISPR-Cas
- electronic health record
- artificial intelligence
- Klebsiella pneumoniae
- antibiotic resistance genes
- circulating tumor
- genetic diversity
- genome wide analysis
- single molecule
- emergency department
- cell free
- data analysis
- multidrug resistant