Leveraging Existing 16S rRNA Gene Surveys To Identify Reproducible Biomarkers in Individuals with Colorectal Tumors.
Marc A Sze, Patrick D Schloss. Published in: mBio (2018)
An increasing body of literature suggests that both individual bacteria and collections of bacteria are associated with the progression of colorectal cancer. As the number of studies investigating these associations increases and the number of subjects in each study increases, a meta-analysis to identify the associations that are the most predictive of disease progression is warranted. We analyzed previously published 16S rRNA gene sequencing data collected from feces and colon tissue. We quantified the odds ratios (ORs) for individual bacterial taxa that were associated with an individual having tumors relative to a normal colon. Among the fecal samples, no taxa had significant ORs associated with adenoma, and 8 taxa had significant ORs associated with carcinoma. Similarly, among the tissue samples, no taxa had a significant OR associated with adenoma, and 3 taxa had significant ORs associated with carcinoma. Among the significant ORs, the association between individual taxa and tumor diagnosis was modest: every OR was equal to or below 7.11. Because individual taxa had limited association with tumor diagnosis, we trained Random Forest classification models in three ways: using only the taxa that had significant ORs, using the entire collection of taxa found in each study, and using operational taxonomic units defined based on a 97% similarity threshold. All training approaches yielded similar classification success as measured using the area under the curve. The ability to correctly classify individuals with adenomas was poor, and the ability to classify individuals with carcinomas was considerably better, using sequences from either feces or tissue.
IMPORTANCE Colorectal cancer is a significant and growing health problem in which animal models and epidemiological data suggest that the colonic microbiota has a role in tumorigenesis.
These observations indicate that the colonic microbiota is a reservoir of biomarkers that may improve our ability to detect colonic tumors using noninvasive approaches. This meta-analysis identifies and validates a set of 8 bacterial taxa that can be used within a Random Forest modeling framework to differentiate individuals as having normal colons or carcinomas. When models trained using one data set were tested on other data sets, the models performed well. These results lend support to the use of fecal biomarkers for the detection of tumors. Furthermore, these biomarkers are plausible candidates for further mechanistic studies into the role of the gut microbiota in tumorigenesis.
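The abstract does not state how the ORs were computed, so the following is only a minimal sketch of the general idea, assuming a simple 2x2 table (taxon relative abundance above or below a cutoff, carcinoma versus normal colon) and a Wald-type 95% confidence interval on the log scale. The function name `odds_ratio_with_ci` and all counts are hypothetical illustrations, not values or code from the study.

```python
import math


def odds_ratio_with_ci(case_high, case_low, control_high, control_low, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using a Wald interval.

    case_high / case_low: individuals with carcinoma whose abundance of the
    taxon is above / below the chosen cutoff (cutoff choice is an assumption).
    control_high / control_low: the same split for individuals with normal colons.
    """
    counts = (case_high, case_low, control_high, control_low)
    if min(counts) <= 0:
        # The simple formula is undefined when any cell is empty.
        raise ValueError("all four cells must be positive")

    odds_ratio = (case_high * control_low) / (case_low * control_high)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(sum(1.0 / n for n in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the calculation.
or_value, ci_low, ci_high = odds_ratio_with_ci(30, 20, 15, 35)
significant = ci_low > 1.0 or ci_high < 1.0  # interval excludes OR = 1
```

Under these assumptions, an OR counts as significant when its confidence interval excludes 1. A meta-analysis would additionally pool the per-study estimates, which this sketch does not attempt.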