Weakly Supervised Video Anomaly Detection Based on 3D Convolution and LSTM.
Zhen Ma, José J M Machado, João Manuel R S Tavares
Published in: Sensors (Basel, Switzerland) (2021)
Weakly supervised video anomaly detection has recently become a focus of computer vision research, thanks to the availability of large-scale weakly supervised video datasets. However, most existing work is limited to frame-level classification, with an emphasis on detecting the presence of specific objects or activities. This article proposes a new neural network architecture that efficiently extracts prominent features for deciding whether a video contains anomalies. Each video is treated as a single input, and detection is framed as assigning a label to the whole video. Spatial and temporal features are extracted by three-dimensional convolutions, and the relationships among them are then modeled by an LSTM network. The concise structure of the proposed method enables high computational efficiency, and extensive experiments demonstrate its effectiveness.
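The pipeline described in the abstract (three-dimensional convolutions for spatio-temporal features, an LSTM over the resulting sequence, and a single label per video) can be illustrated with a minimal NumPy sketch. This is not the authors' network: the abstract gives no layer sizes, so the filter count, kernel size, hidden size, spatial average pooling, sigmoid output, and all function names (`conv3d_features`, `lstm_last_hidden`, `video_anomaly_score`, `init_params`) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv3d_features(video, kernels):
    """Valid 3D cross-correlation, ReLU, then spatial average pooling.

    video:   (T, H, W) grayscale clip
    kernels: (C, kt, kh, kw) filter bank (sizes are illustrative assumptions)
    returns: (T - kt + 1, C), one feature vector per temporal position
    """
    # windows has shape (T', H', W', kt, kh, kw)
    windows = sliding_window_view(video, kernels.shape[1:])
    responses = np.einsum("thwijk,cijk->thwc", windows, kernels)
    responses = np.maximum(responses, 0.0)  # ReLU
    return responses.mean(axis=(1, 2))      # pool over space, keep time


def lstm_last_hidden(sequence, W, U, b):
    """Run a single-layer LSTM over sequence (steps, C); return the final hidden state.

    W: (4*Hd, C), U: (4*Hd, Hd), b: (4*Hd,); gates are stacked as i, f, o, g.
    """
    hd = U.shape[1]
    h = np.zeros(hd)
    c = np.zeros(hd)
    for x in sequence:
        z = W @ x + U @ h + b
        i = sigmoid(z[0 * hd:1 * hd])   # input gate
        f = sigmoid(z[1 * hd:2 * hd])   # forget gate
        o = sigmoid(z[2 * hd:3 * hd])   # output gate
        g = np.tanh(z[3 * hd:4 * hd])   # candidate cell state
        c = f * c + i * g
        h = o * np.tanh(c)
    return h


def init_params(rng, channels=4, kernel=(3, 3, 3), hidden=8):
    """Random, untrained parameters; every size here is a hypothetical choice."""
    return {
        "kernels": 0.1 * rng.standard_normal((channels, *kernel)),
        "W": 0.1 * rng.standard_normal((4 * hidden, channels)),
        "U": 0.1 * rng.standard_normal((4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "w_out": 0.1 * rng.standard_normal(hidden),
        "b_out": 0.0,
    }


def video_anomaly_score(video, params):
    """Map a whole clip to one score in (0, 1): a video-level label, not per-frame."""
    feats = conv3d_features(video, params["kernels"])
    h = lstm_last_hidden(feats, params["W"], params["U"], params["b"])
    return float(sigmoid(params["w_out"] @ h + params["b_out"]))
```

Calling `video_anomaly_score(video, init_params(np.random.default_rng(0)))` on a `(T, H, W)` array returns one score for the entire clip, which mirrors the video-label assignment the abstract describes. In a weakly supervised setting only that video-level label would be needed for training, which is omitted here.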