
Learning Latent Super-Events to Detect Multiple Activities in Videos

AJ Piergiovanni, Michael S. Ryoo

Published: 2017-12-05 | Venue: CVPR 2018 | Tasks: Action Detection, Activity Detection

Abstract

In this paper, we introduce the concept of learning latent super-events from activity videos, and present how it benefits activity detection in continuous videos. We define a super-event as a set of multiple events occurring together in videos with a particular temporal organization; it is the opposite of the concept of sub-events. Real-world videos contain multiple activities and are rarely segmented (e.g., surveillance videos), and learning latent super-events allows the model to capture how the events are temporally related in videos. We design temporal structure filters that enable the model to focus on particular sub-intervals of the videos, and use them together with a soft attention mechanism to learn representations of latent super-events. Super-event representations are combined with per-frame or per-segment CNNs to provide frame-level annotations. Our approach is designed to be fully differentiable, enabling end-to-end learning of latent super-event representations jointly with the activity detector that uses them. Our experiments with multiple public video datasets confirm that the proposed concept of latent super-event learning significantly benefits activity detection, advancing the state of the art.
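
The pooling idea in the abstract (temporal structure filters that focus on sub-intervals, combined by soft attention) can be illustrated with a minimal sketch. This is not the authors' implementation: the Cauchy-shaped kernel, the function names, and the fixed `center`/`width` values are assumptions made for illustration. In a trained model these parameters and the attention logits would be learned end to end.

```python
import math


def temporal_filter(length, center, width):
    """Normalized weights over `length` frames, peaked at `center`.

    Assumption: a Cauchy-shaped kernel; the abstract does not name the shape.
    """
    raw = [
        1.0 / (math.pi * width * (1.0 + ((t - center) / width) ** 2))
        for t in range(length)
    ]
    total = sum(raw)
    return [w / total for w in raw]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def super_event_representation(frame_features, filters, attention_logits):
    """Pool per-frame features into one video-level vector.

    frame_features:   list of T feature vectors, each of length D
    filters:          list of (center, width) pairs, one per filter
    attention_logits: one unnormalized score per filter
    """
    num_frames = len(frame_features)
    dim = len(frame_features[0])
    attention = softmax(attention_logits)
    pooled = [0.0] * dim
    for weight, (center, width) in zip(attention, filters):
        frame_weights = temporal_filter(num_frames, center, width)
        for t in range(num_frames):
            for d in range(dim):
                pooled[d] += weight * frame_weights[t] * frame_features[t][d]
    return pooled


def combine(frame_feature, super_event):
    """Concatenate a frame's own feature with the shared super-event vector
    (one simple way to feed both into a per-frame classifier)."""
    return list(frame_feature) + list(super_event)


# Toy usage: 5 frames, 2-dimensional features, two hypothetical filters.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
rep = super_event_representation(features, [(0.5, 1.0), (3.0, 1.0)], [0.0, 0.0])
per_frame_inputs = [combine(f, rep) for f in features]
```

Because every step is a weighted sum or a softmax, the whole pooling path is differentiable, which is the property the abstract relies on for end-to-end training.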

Results

Task             | Dataset      | Metric | Value | Model
Action Detection | Multi-THUMOS | mAP    | 36.4  | I3D + our super-event
Action Detection | Charades     | mAP    | 19.41 | Super-events (RGB+Flow)
