Consistency-based Self-supervised Learning for Temporal Anomaly Localization

Aniello Panariello, Angelo Porrello, Simone Calderara, Rita Cucchiara

2022-08-10

Anomaly Detection In Surveillance Videos, Anomaly Localization, Self-Supervised Learning, Weakly-supervised Temporal Action Localization, Supervised Anomaly Detection, Weakly-supervised Anomaly Detection


Abstract

This work tackles weakly supervised anomaly detection, in which a predictor is allowed to learn not only from normal examples but also from a few labeled anomalies made available during training. In particular, we deal with the localization of anomalous activities within the video stream: this is a very challenging scenario, as training examples come only with video-level annotations (and not frame-level ones). Several recent works have proposed regularization terms to address it, for example by enforcing sparsity and smoothness constraints over the weakly learned frame-level anomaly scores. In this work, we draw inspiration from recent advances in self-supervised learning and ask the model to yield the same scores for different augmentations of the same video sequence. We show that enforcing such an alignment improves the performance of the model on XD-Violence.
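The three regularizers mentioned in the abstract (sparsity, smoothness, and consistency across augmentations) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the mean-score and squared-difference forms, the function names, the weights `w_sparse`, `w_smooth`, `w_cons`, and the premise that both augmented views yield one score per frame at aligned positions.

```python
from typing import Sequence


def sparsity_term(scores: Sequence[float]) -> float:
    """Mean frame-level score; low when few frames are flagged as anomalous."""
    return sum(scores) / len(scores)


def smoothness_term(scores: Sequence[float]) -> float:
    """Mean squared difference between scores of consecutive frames."""
    if len(scores) < 2:
        return 0.0
    diffs = [(b - a) ** 2 for a, b in zip(scores, scores[1:])]
    return sum(diffs) / len(diffs)


def consistency_term(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Mean squared gap between the scores of two augmented views.

    Assumes the two views are temporally aligned, frame by frame.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both views must have the same number of frames")
    gaps = [(a - b) ** 2 for a, b in zip(scores_a, scores_b)]
    return sum(gaps) / len(gaps)


def regularizer(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    w_sparse: float = 1.0,  # hypothetical weight, not taken from the paper
    w_smooth: float = 1.0,  # hypothetical weight, not taken from the paper
    w_cons: float = 1.0,    # hypothetical weight, not taken from the paper
) -> float:
    """Weighted sum of the three terms, averaged over the two views where relevant."""
    sparse = (sparsity_term(scores_a) + sparsity_term(scores_b)) / 2
    smooth = (smoothness_term(scores_a) + smoothness_term(scores_b)) / 2
    cons = consistency_term(scores_a, scores_b)
    return w_sparse * sparse + w_smooth * smooth + w_cons * cons
```

In this sketch the consistency term is zero exactly when the two views receive identical scores at every frame, which is the alignment the abstract asks for. In an actual training setup these terms would be computed on differentiable tensors and added to the video-level classification loss.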

Results

Task	Dataset	Metric	Value	Model
Video Understanding	XD-Violence	AP	71.68	CSL_TAL
Video	XD-Violence	AP	71.68	CSL_TAL
Anomaly Detection	XD-Violence	AP	71.68	CSL_TAL
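For readers unfamiliar with the AP metric in the table, the sketch below computes non-interpolated average precision from binary labels and predicted scores. The evaluation protocol is not described on this page, so treating each frame as one sample is an assumption, as is the function name `average_precision`; the table appears to report the value as a percentage.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Non-interpolated AP: mean of precision@k over the ranks k holding a positive.

    labels: 1 for anomalous, 0 for normal (assumed here to be one per frame).
    scores: predicted anomaly scores, higher means more anomalous.
    Ties are broken by input order, which is a simplification.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("AP is undefined without positive labels")

    # Rank samples from highest to lowest score (stable sort keeps tie order).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / n_pos
```

For example, with labels `[1, 0, 1, 0]` and scores `[0.9, 0.8, 0.7, 0.1]`, the positives sit at ranks 1 and 3, so AP is (1/1 + 2/3) / 2, roughly 0.833.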
