Exogenous sequences in tumors and immune cells (exotic): a tool for estimating the microbe abundances in tumor RNAseq data.
Rebecca Hoyd, Caroline E Dravillas, YunZhou Liu, Malvenderjit S Jagjit Singh, Mitchell Muniak, Ning Jin, Nicholas C Denko, David P Carbone, Xiaokui M Mo, Daniel J Spakowicz
Published in: Cancer research communications (2023)
The microbiome affects cancer, from carcinogenesis to response to treatments. New evidence suggests that microbes are also present in many tumors, though research into how they affect tumor biology and clinical outcomes is in its early stages. A broad survey of tumor microbiome samples across several independent datasets is needed to identify robust correlations for follow-up testing. We created a tool called {exotic}, for "exogenous sequences in tumors and immune cells", to carefully identify the tumor microbiome within RNAseq datasets. We applied it to samples collected through the Oncology Research Information Exchange Network (ORIEN) and The Cancer Genome Atlas (TCGA). We showed how the processing removes contaminants and batch effects to yield microbe abundances consistent with approaches not based on high-throughput sequencing and with DNA-amplicon-based measurements of a subset of the same tumors. We sought to establish clinical relevance by correlating the microbe abundances with various clinical and tumor measurements, such as age and tumor hypoxia. This process leveraged the two datasets and retained only the concordant associations, meaning those that were significant and in the same direction in both. We observed associations with survival and clinical variables that are cancer-specific, and relatively few associations with immune composition. Finally, we explored potential mechanisms by which microbes and tumors may interact using a network-based approach. Alistipes, a common gut commensal, showed the highest network degree centrality and was associated with genes related to metabolism and inflammation. The {exotic} tool can support the discovery of microbes in tumors in a way that leverages the many existing and growing RNAseq datasets.
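
The concordance filter described in the abstract (keep an association only if it is significant and has the same direction in both datasets) can be illustrated with a minimal Python sketch. This is not the {exotic} implementation: the `Association` type, the `concordant_associations` function, the 0.05 threshold, and all numeric values and the placeholder name `microbe_B` are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical key: (microbe, clinical or tumor variable)
Key = Tuple[str, str]


@dataclass(frozen=True)
class Association:
    effect: float   # signed effect size, e.g. a regression coefficient
    p_value: float  # p-value for the association in one dataset


def concordant_associations(
    first: Dict[Key, Association],
    second: Dict[Key, Association],
    alpha: float = 0.05,  # assumed significance threshold
) -> List[Key]:
    """Return keys that are significant and same-signed in both datasets."""
    kept = []
    # Only associations tested in both datasets can be concordant.
    for key in sorted(first.keys() & second.keys()):
        a, b = first[key], second[key]
        both_significant = a.p_value < alpha and b.p_value < alpha
        same_direction = a.effect * b.effect > 0
        if both_significant and same_direction:
            kept.append(key)
    return kept


# Made-up example values, not results from the paper.
orien = {
    ("Alistipes", "hypoxia"): Association(0.42, 0.003),
    ("Alistipes", "age"): Association(0.10, 0.40),
    ("microbe_B", "hypoxia"): Association(-0.30, 0.01),
}
tcga = {
    ("Alistipes", "hypoxia"): Association(0.35, 0.02),
    ("Alistipes", "age"): Association(0.12, 0.01),
    ("microbe_B", "hypoxia"): Association(0.25, 0.04),
}

result = concordant_associations(orien, tcga)
print(result)
```

In this toy input, only the first pair survives: the second is not significant in one dataset, and the third is significant in both but with opposite signs. Requiring agreement across independent cohorts is a conservative choice that trades sensitivity for robustness.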