CAVE: Connectome Annotation Versioning Engine.

Sven Dorkenwald, Casey M Schneider-Mizell, Derrick Brittain, Akhilesh Halageri, Chris Jordan, Nico Kemnitz, Manual A Castro, William M Silversmith, Jeremy Maitin-Shephard, Jakob Troidl, Hanspeter Pfister, Valentin Gillet, Daniel Xenes, J Alexander Bae, Agnes L Bodor, JoAnn Buchanan, Daniel J Bumbarger, Leila Elabbady, Zhen Jia, Daniel Kapner, Sam Kinn, Kisuk Lee, Kai Li, Ran Lu, Thomas Macrina, Gayathri Mahalingam, Eric Mitchell, Shanka Subhra Mondal, Shang Mu, Barak Nehoran, Sergiy Popovych, Marc M Takeno, Russel M Torres, Nicholas L Turner, William Wong, Jingpeng Wu, Wenjing Yin, Szi-Chieh Yu, R Clay Reid, Nuno Maçarico da Costa, H Sebastian Seung, Forrest C Collman

Published in: bioRxiv : the preprint server for biology (2023)

Advances in electron microscopy, image segmentation and computational infrastructure have given rise to large-scale and richly annotated connectomic datasets, which are increasingly shared across communities. To enable collaboration, users need to be able to concurrently create new annotations and correct errors in the automated segmentation by proofreading. In large datasets, every proofreading edit relabels the cell identities of millions of voxels and thousands of annotations such as synapses. For analysis, users require immediate and reproducible access to this constantly changing and expanding data landscape. Here, we present the Connectome Annotation Versioning Engine (CAVE), a computational infrastructure for immediate and reproducible connectome analysis in up to petascale datasets (∼1 mm³) while proofreading and annotation are ongoing. For segmentation, CAVE provides a distributed proofreading infrastructure for continuous versioning of large reconstructions. Annotations in CAVE are defined by locations so that they can be quickly assigned to the underlying segment, which enables fast analysis queries of CAVE's data at arbitrary time points. CAVE supports schematized, extensible annotations, so that researchers can readily design novel annotation types. CAVE is already used for many connectomics datasets, including the largest datasets available to date.
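The idea of location-defined annotations that are resolved against a versioned segmentation can be illustrated with a toy model. The sketch below is not CAVE's implementation or API; every name in it (ToySegmentation, Synapse, resolve, voxel_to_supervoxel) is hypothetical. It assumes, purely for illustration, that each annotated location falls into a fixed atomic unit (called a supervoxel here) and that proofreading edits only change which segment a supervoxel belongs to over time.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class ToySegmentation:
    """Hypothetical stand-in for a versioned segmentation (not CAVE's data model)."""

    # supervoxel id -> time-sorted list of (timestamp, segment id)
    history: dict = field(default_factory=dict)

    def relabel(self, supervoxel, timestamp, segment_id):
        """Record an edit that assigns a supervoxel to a segment from `timestamp` on."""
        entries = self.history.setdefault(supervoxel, [])
        entries.append((timestamp, segment_id))
        entries.sort()

    def segment_at(self, supervoxel, timestamp):
        """Return the segment the supervoxel belonged to at `timestamp`."""
        entries = self.history[supervoxel]
        times = [t for t, _ in entries]
        idx = bisect_right(times, timestamp) - 1  # last edit at or before timestamp
        if idx < 0:
            raise KeyError(f"supervoxel {supervoxel} has no label at time {timestamp}")
        return entries[idx][1]


@dataclass(frozen=True)
class Synapse:
    """An annotation stored by location only, with no segment ids attached."""

    pre_point: tuple
    post_point: tuple


def resolve(synapse, voxel_to_supervoxel, segmentation, timestamp):
    """Map a synapse's two locations to (pre, post) segment ids at a time point."""
    pre_sv = voxel_to_supervoxel[synapse.pre_point]
    post_sv = voxel_to_supervoxel[synapse.post_point]
    return (
        segmentation.segment_at(pre_sv, timestamp),
        segmentation.segment_at(post_sv, timestamp),
    )


# Toy data: two locations, each inside a different supervoxel.
voxel_to_supervoxel = {(0, 0, 0): 2, (5, 5, 5): 3}

seg = ToySegmentation()
seg.relabel(supervoxel=2, timestamp=0, segment_id=100)
seg.relabel(supervoxel=3, timestamp=0, segment_id=200)
# A later proofreading edit splits supervoxel 2 off into a new segment.
seg.relabel(supervoxel=2, timestamp=10, segment_id=300)

syn = Synapse(pre_point=(0, 0, 0), post_point=(5, 5, 5))
print(resolve(syn, voxel_to_supervoxel, seg, timestamp=5))   # (100, 200)
print(resolve(syn, voxel_to_supervoxel, seg, timestamp=15))  # (300, 200)
```

Because the synapse itself stores only locations, the edit at time 10 requires no change to the annotation; the same record simply resolves to different segments depending on the time point queried, which is what makes a query reproducible for a fixed timestamp.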

Keyphrases