Retrospective Definition of Clostridioides difficile PCR Ribotypes on the Basis of Whole Genome Polymorphisms: A Proof of Principle Study.
Manisha Goyal, Lysiane Hauben, Hannes Pouseele, Magali Jaillard, Katrien De Bruyne, Alex van Belkum, Richard Goering. Published in: Diagnostics (Basel, Switzerland) (2020)
Clostridioides difficile is a cause of health care-associated infections. The epidemiological study of C. difficile infection (CDI) traditionally involves PCR ribotyping. However, ribotyping will be increasingly replaced by whole genome sequencing (WGS). WGS types therefore need to be correlated with classical ribotypes (RTs) so that retrospective clinical studies can still be performed. Here, we selected genomes of hyper-virulent C. difficile strains of RT001, RT017, RT027, RT078, and RT106 to identify new discriminatory markers using in silico ribotyping PCR and De Bruijn graph-based Genome Wide Association Studies (DBGWAS). First, in silico ribotyping PCR was performed using reference primer sequences and 30 C. difficile genomes of the five RTs listed above. Second, discriminatory genomic markers were sought with DBGWAS using a set of 160 independent C. difficile genomes (14 ribotypes). RT-specific genetic polymorphisms were annotated and validated for their specificity and sensitivity against a larger dataset of 2425 C. difficile genomes covering 132 different RTs. In silico PCR ribotyping was unsuccessful due to non-specific or missing theoretical RT PCR fragments. More successfully, DBGWAS discovered a total of 47 new markers (13 in RT017, 12 in RT078, 9 in RT106, 7 in RT027, and 6 in RT001) with minimum q-values of 0 to 7.40 × 10⁻⁵, indicating excellent marker selectivity. The specificity and sensitivity of individual markers ranged from 0.92 to 1.0 and reached 1.0 when two markers were combined, hence providing unambiguous RT identification based on a single genome sequence. Markers were scattered throughout the C. difficile genome in intra- and intergenic regions. We propose here a set of new genomic polymorphisms that efficiently identify five hyper-virulent RTs using WGS data only. Further studies need to show whether this initial proof-of-principle observation can be extended to all 600 existing RTs.
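The sensitivity and specificity figures reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the genome identifiers, and the presence/absence values are hypothetical toy data, and the abstract does not say how two markers were combined, so the "both markers present" rule used here is an assumption.

```python
from typing import Dict, Tuple


def sensitivity_specificity(
    has_marker: Dict[str, bool],
    ribotype: Dict[str, str],
    target_rt: str,
) -> Tuple[float, float]:
    """Score one marker as a presence/absence test for target_rt."""
    tp = fn = tn = fp = 0
    for genome, rt in ribotype.items():
        present = has_marker[genome]
        if rt == target_rt:
            if present:
                tp += 1  # target genome carrying the marker
            else:
                fn += 1  # target genome missed by the marker
        else:
            if present:
                fp += 1  # non-target genome carrying the marker
            else:
                tn += 1  # non-target genome correctly negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one target and one non-target genome")
    return tp / (tp + fn), tn / (tn + fp)


def combine_and(a: Dict[str, bool], b: Dict[str, bool]) -> Dict[str, bool]:
    """Assumed combination rule: a genome is positive only if both markers are present."""
    return {genome: a[genome] and b[genome] for genome in a}


# Toy data, NOT taken from the study: six genomes, three of them RT027.
ribotype = {
    "g1": "RT027", "g2": "RT027", "g3": "RT027",
    "g4": "RT001", "g5": "RT078", "g6": "RT017",
}
# Each hypothetical marker has one false positive, in a different genome.
marker_a = {"g1": True, "g2": True, "g3": True, "g4": True, "g5": False, "g6": False}
marker_b = {"g1": True, "g2": True, "g3": True, "g4": False, "g5": True, "g6": False}

single_a = sensitivity_specificity(marker_a, ribotype, "RT027")
single_b = sensitivity_specificity(marker_b, ribotype, "RT027")
combined = sensitivity_specificity(combine_and(marker_a, marker_b), ribotype, "RT027")

print("marker A:", single_a)
print("marker B:", single_b)
print("A and B :", combined)
```

In this toy case each marker alone is fully sensitive but not fully specific, and requiring both removes the false positives. An AND rule can only lower or preserve sensitivity, so with real data the choice of combination rule would have to be checked against the validation set.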