Investigating the Diversity of Tuberculosis Spoligotypes with Dimensionality Reduction and Graph Theory.
Gaetan Senelle, Christophe Guyeux, Guislaine Refrégier, Christophe Sola. Published in: Genes (2022)
The spoligotype is a graphical description of the CRISPR locus present in Mycobacterium tuberculosis, which has the particularity of having only 68 possible spacers. It can be obtained easily either in vitro or in silico, and it provides, at low cost, summary information about lineage or even antibiotic resistance (when resistance is known to be associated with a particular cluster). The objective of this article is to show that this representation is richer than it seems and has so far been under-exploited. We first recall an original way to represent spoligotypes as points in the plane, which highlights possible sub-lineages, particularities of animal strains, and so on. This graphical representation reveals clusters and a graph-like skeleton, which led us to treat spoligotypes as vertices of an unconnected directed graph. In this paper, we therefore propose to exploit in detail a graph-based description of the variety of spoligotypes, and we show to what extent such a description can be informative.
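
To make the idea of spoligotypes as vertices of a directed graph concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not say how edges are defined, so the rule used here (an edge from one profile to another when the second can be obtained from the first by losing a single contiguous block of spacers) is an assumption chosen for illustration. The function names and the toy profiles are hypothetical as well; only the length of 68 spacers comes from the text.

```python
from itertools import permutations

N_SPACERS = 68  # number of possible spacers stated in the abstract


def parse(profile):
    """Turn a string of '1'/'0' (spacer present/absent) into a tuple of ints."""
    if len(profile) != N_SPACERS or set(profile) - {"0", "1"}:
        raise ValueError("expected a string of 68 characters in {0, 1}")
    return tuple(int(c) for c in profile)


def is_single_block_deletion(parent, child):
    """ASSUMED edge rule: child is parent minus one contiguous run of spacers."""
    # A child may never carry a spacer that its parent lacks.
    if any(p == 0 and c == 1 for p, c in zip(parent, child)):
        return False
    lost = [i for i, (p, c) in enumerate(zip(parent, child)) if p == 1 and c == 0]
    if not lost:
        return False  # identical profiles: no edge
    # The lost spacers form one block if nothing survives between the
    # first and the last lost position.
    return all(child[i] == 0 for i in range(lost[0], lost[-1] + 1))


def build_graph(profiles):
    """Return adjacency lists {name: [children]} under the assumed edge rule."""
    vectors = {name: parse(s) for name, s in profiles.items()}
    edges = {name: [] for name in vectors}
    for a, b in permutations(vectors, 2):
        if is_single_block_deletion(vectors[a], vectors[b]):
            edges[a].append(b)
    return edges


# Toy profiles, invented for illustration only (not real spoligotypes).
toy = {
    "full": "1" * 68,
    "a": "1" * 10 + "0" * 5 + "1" * 53,
    "b": "1" * 10 + "0" * 5 + "1" * 20 + "0" * 3 + "1" * 30,
    "d": "0" + "1" * 67,
}
graph = build_graph(toy)
for name, children in graph.items():
    print(name, "->", children)
```

In this toy data, "b" differs from "full" by two separate deletions, so under the assumed rule it is reached only through "a". Other edge definitions (for example, a bound on Hamming distance) would yield a different graph.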