SqueakOut: Autoencoder-based segmentation of mouse ultrasonic vocalizations.

Published in: bioRxiv : the preprint server for biology (2024)

Mice emit ultrasonic vocalizations (USVs) that are important for social communication. Despite great advances in recent years in tools that detect USVs in audio files, highly accurate segmentation of USVs from spectrograms (i.e., separating the vocalization from background noise) remains a significant challenge. Here, we present a new dataset of 12,954 annotated spectrograms explicitly labeled for mouse USV segmentation. Leveraging this dataset, we developed SqueakOut, a lightweight (4.6M parameters) fully convolutional autoencoder that achieves high accuracy in supervised segmentation of USVs from spectrograms, with a Dice score of 90.22. SqueakOut combines a MobileNetV2 backbone with skip connections and transposed convolutions to precisely segment USVs. Using stochastic data augmentation and a hybrid loss function, SqueakOut learns segmentation that is robust across varying recording conditions. We evaluate SqueakOut's performance and demonstrate substantial improvements over existing methods such as VocalMat (Dice score of 63.82). The accurate USV segmentations enabled by SqueakOut will facilitate novel methods for vocalization classification and more accurate analysis of mouse communication. To promote further research, we publicly release the annotated dataset of 12,954 spectrograms and the SqueakOut implementation.
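For readers unfamiliar with the metric, the Dice score quoted above measures the overlap between a predicted segmentation mask and an annotated one: twice the number of pixels marked in both masks, divided by the total number of marked pixels in the two masks. The following is a minimal sketch of the standard definition, not the SqueakOut evaluation code; the function name `dice_score`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration. The reported values (90.22, 63.82) appear to be on a 0 to 100 scale, which is also an assumption here.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as 2D lists of 0/1.

    Hypothetical helper for illustration; returns a value in [0, 1].
    """
    # Flatten both masks and coerce every entry to 0 or 1.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_true = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_true):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as USV in both the prediction and the annotation.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_true))
    total = sum(flat_pred) + sum(flat_true)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x4 "spectrogram" masks: the prediction marks one extra pixel.
predicted = [[0, 1, 1, 0],
             [0, 1, 0, 0]]
annotated = [[0, 1, 1, 0],
             [0, 0, 0, 0]]

score = dice_score(predicted, annotated)
print(f"Dice: {score:.2f} ({100 * score:.2f} on a 0-100 scale)")
```

In this toy case two pixels overlap, the prediction marks three pixels, and the annotation marks two, so the score is 2 * 2 / (3 + 2) = 0.8.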

Keyphrases