Application of a Database-Independent Approach To Assess the Quality of Operational Taxonomic Unit Picking Methods.
Patrick D Schloss
Published in: mSystems (2016)
Assignment of 16S rRNA gene sequences to operational taxonomic units (OTUs) allows microbial ecologists to overcome the inconsistencies and biases within bacterial taxonomy and provides a strategy for clustering similar sequences that do not have representatives in a reference database. I have applied the Matthews correlation coefficient to assess the ability of 15 reference-independent and -dependent clustering algorithms to assign sequences to OTUs. This metric quantifies the ability of an algorithm to reflect the relationships between sequences without the use of a reference and can be applied to any data set or method. The most consistently robust method was the average neighbor algorithm; however, for some data sets, other algorithms matched its performance.
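As a rough illustration of how the Matthews correlation coefficient can score OTU assignments without a reference, the sketch below classifies every pair of sequences against a distance threshold. The pairwise definitions are assumptions not spelled out in the abstract: a pair is a true positive when the two sequences are within the threshold and share an OTU, a false positive when they share an OTU but are farther apart than the threshold, and so on. The function name `otu_mcc`, the 0.03 threshold, and the toy distances are hypothetical and chosen only for the example.

```python
from itertools import combinations
from math import sqrt


def otu_mcc(distances, otu_of, threshold=0.03):
    """Matthews correlation coefficient for one OTU assignment.

    distances: dict mapping a sorted (seq_a, seq_b) tuple to a pairwise distance.
    otu_of:    dict mapping each sequence name to its OTU label.
    threshold: distance cutoff (0.03 is an assumed, illustrative value).
    """
    tp = tn = fp = fn = 0
    for a, b in combinations(sorted(otu_of), 2):
        similar = distances[(a, b)] <= threshold   # pair is within the cutoff
        same_otu = otu_of[a] == otu_of[b]          # pair was clustered together
        if similar and same_otu:
            tp += 1
        elif not similar and not same_otu:
            tn += 1
        elif same_otu:          # clustered together but too distant
            fp += 1
        else:                   # close enough but split apart
            fn += 1
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data (hypothetical): A, B, C are mutually close; D is distant from all.
toy_distances = {
    ("A", "B"): 0.01, ("A", "C"): 0.02, ("B", "C"): 0.02,
    ("A", "D"): 0.10, ("B", "D"): 0.12, ("C", "D"): 0.11,
}
good_assignment = {"A": 1, "B": 1, "C": 1, "D": 2}
poor_assignment = {"A": 1, "B": 1, "C": 2, "D": 2}

print(otu_mcc(toy_distances, good_assignment))  # 1.0
print(otu_mcc(toy_distances, poor_assignment))  # 0.0
```

Because the score depends only on the pairwise distances and the OTU labels, the same function can be applied to the output of any clustering method, which is the sense in which the metric is database independent.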