Streaming histogram sketching for rapid microbiome analytics.
Will P M Rowe, Anna Paola Carrieri, Cristina Alcon-Giner, Shabhonam Caim, Alex Shaw, Kathleen Sim, J Simon Kroll, Lindsay J Hall, Edward O Pyzer-Knapp, Martyn D Winn. Published in: Microbiome (2019)
Our method offers a new approach to processing microbiome data streams, allowing samples to be rapidly clustered, indexed and classified. We also provide our implementation, Histosketching Using Little K-mers (HULK), which can histosketch a typical 2 GB microbiome in 50 s on a standard laptop using four cores, with the sketch occupying 3000 bytes of disk space (https://github.com/will-rowe/hulk).