A Practical Bioinformatics Workflow for Routine Analysis of Bacterial WGS Data.
Aitor Atxaerandio-Landa, Ainhoa Arrieta-Gisasola, Lorena Laorden, Joseba Bikandi, Javier Garaizar, Irati Martinez-Malaxetxebarria, Ilargi Martinez-Ballesteros
Published in: Microorganisms (2022)
The use of whole-genome sequencing (WGS) for bacterial characterisation has increased substantially over the last decade. Its high throughput and decreasing cost have led to significant changes in outbreak investigations and surveillance of a wide variety of microbial pathogens. Despite the many advantages of WGS, several drawbacks concerning data analysis and management, as well as a general lack of standardisation, hinder its integration into routine use. In this work, a bioinformatics workflow for Illumina WGS data is presented for bacterial characterisation, including genome annotation, species identification, serotype prediction, antimicrobial resistance prediction, detection of virulence-related genes and plasmid replicons, core-genome-based or single nucleotide polymorphism (SNP)-based phylogenetic clustering, and sequence typing. The workflow was tested using a collection of 22 in-house sequences of Salmonella enterica isolates from a local outbreak, together with a collection of 182 publicly available Salmonella genomes. No errors were reported during execution, and all genomes were analysed. The bioinformatics workflow can be tailored to other pathogens of interest and is freely available for academic and non-profit use as an uploadable file for the Galaxy platform.
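To make the idea of SNP-based clustering mentioned above more concrete, the following is a minimal Python sketch. It is not the workflow's own implementation, and the abstract does not say which tools perform this step. It assumes the genomes have already been reduced to equal-length aligned sequences. The function names (`snp_distance`, `pairwise_snp_distances`, `cluster_by_threshold`) and the single-linkage threshold approach are illustrative assumptions.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has an ambiguous base or gap
    (characters in `ignore`) are skipped. Hypothetical helper.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_snp_distances(alignment):
    """Return {(id_a, id_b): distance} for every pair of isolates."""
    return {
        (id_a, id_b): snp_distance(alignment[id_a], alignment[id_b])
        for id_a, id_b in combinations(sorted(alignment), 2)
    }


def cluster_by_threshold(alignment, max_snps):
    """Single-linkage clustering: isolates are linked when their SNP
    distance is at most `max_snps` (an assumed, user-chosen cut-off)."""
    parent = {name: name for name in alignment}

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in pairwise_snp_distances(alignment).items():
        if dist <= max_snps:
            parent[find(a)] = find(b)

    clusters = {}
    for name in alignment:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    # Toy alignment with made-up isolate names and sequences.
    toy = {
        "isolate_A": "ACGTACGTAC",
        "isolate_B": "ACGTACGTAT",
        "isolate_C": "TCGAACGGAT",
    }
    print(pairwise_snp_distances(toy))
    print(cluster_by_threshold(toy, max_snps=1))
```

In practice, a SNP cut-off suitable for outbreak delineation depends on the organism and the study, so `max_snps` is left as a parameter rather than fixed.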
Keyphrases
- antimicrobial resistance
- electronic health record
- data analysis
- high throughput
- genetic diversity
- Escherichia coli
- genome wide
- adverse drug
- single cell
- clinical practice
- RNA-seq
- big data
- public health
- microbial community
- CRISPR-Cas
- Klebsiella pneumoniae
- patient safety
- cystic fibrosis
- DNA methylation
- quality improvement
- sensitive detection
- drug induced