Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs

Zheng Shou, Dongang Wang, Shih-Fu Chang

Published: 2016-01-09
Venue: CVPR 2016
Tasks: Action Classification, Action Localization, Temporal Localization, General Classification, Classification, Temporal Action Localization

Abstract

We address temporal action localization in untrimmed long videos. This is important because videos in real applications are usually unconstrained and contain multiple action instances plus video content of background scenes or other activities. To address this challenging issue, we exploit the effectiveness of deep networks in temporal action localization via three segment-based 3D ConvNets: (1) a proposal network identifies candidate segments in a long video that may contain actions; (2) a classification network learns a one-vs-all action classification model to serve as initialization for the localization network; and (3) a localization network fine-tunes on the learned classification network to localize each action instance. We propose a novel loss function for the localization network to explicitly consider temporal overlap and therefore achieve high temporal localization accuracy. Only the proposal network and the localization network are used during prediction. On two large-scale benchmarks, our approach achieves significantly superior performance compared with other state-of-the-art systems: mAP increases from 1.7% to 7.4% on MEXaction2 and from 15.0% to 19.0% on THUMOS 2014, when the overlap threshold for evaluation is set to 0.5.
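The "overlap threshold" in the abstract can be illustrated with temporal intersection-over-union (IoU), the measure named in the Results metrics. The following is a minimal sketch, not the authors' code: the function name `temporal_iou` and the (start, end) tuple representation in seconds are assumptions made for illustration.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two segments, each given as (start, end) in seconds.

    Hypothetical helper for illustration; the segment format is an assumption.
    """
    p_start, p_end = pred
    g_start, g_end = gt
    # Length of the overlapping part; zero if the segments are disjoint.
    inter = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    # Union = sum of both lengths minus the part counted twice.
    union = (p_end - p_start) + (g_end - g_start) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # A predicted segment that covers half of the ground-truth action.
    print(temporal_iou((0.0, 10.0), (5.0, 15.0)))
```

With a threshold of 0.5, a prediction such as the one in the usage example (IoU of one third) would not count as a correct localization.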

Results

Task                          Dataset     Metric       Value  Model
Video                         MEXaction2  mAP          7.4    S-CNN
Video                         THUMOS’14   mAP IOU@0.1  47.7   S-CNN
Video                         THUMOS’14   mAP IOU@0.2  43.5   S-CNN
Video                         THUMOS’14   mAP IOU@0.3  36.3   S-CNN
Video                         THUMOS’14   mAP IOU@0.4  28.7   S-CNN
Video                         THUMOS’14   mAP IOU@0.5  19.0   S-CNN
Temporal Action Localization  MEXaction2  mAP          7.4    S-CNN
Temporal Action Localization  THUMOS’14   mAP IOU@0.1  47.7   S-CNN
Temporal Action Localization  THUMOS’14   mAP IOU@0.2  43.5   S-CNN
Temporal Action Localization  THUMOS’14   mAP IOU@0.3  36.3   S-CNN
Temporal Action Localization  THUMOS’14   mAP IOU@0.4  28.7   S-CNN
Temporal Action Localization  THUMOS’14   mAP IOU@0.5  19.0   S-CNN
Zero-Shot Learning            MEXaction2  mAP          7.4    S-CNN
Zero-Shot Learning            THUMOS’14   mAP IOU@0.1  47.7   S-CNN
Zero-Shot Learning            THUMOS’14   mAP IOU@0.2  43.5   S-CNN
Zero-Shot Learning            THUMOS’14   mAP IOU@0.3  36.3   S-CNN
Zero-Shot Learning            THUMOS’14   mAP IOU@0.4  28.7   S-CNN
Zero-Shot Learning            THUMOS’14   mAP IOU@0.5  19.0   S-CNN
Activity Recognition          THUMOS’14   mAP@0.1      47.7   Shou et al.
Activity Recognition          THUMOS’14   mAP@0.2      43.5   Shou et al.
Activity Recognition          THUMOS’14   mAP@0.3      36.3   Shou et al.
Activity Recognition          THUMOS’14   mAP@0.4      28.7   Shou et al.
Activity Recognition          THUMOS’14   mAP@0.5      19.0   Shou et al.
Action Localization           MEXaction2  mAP          7.4    S-CNN
Action Localization           THUMOS’14   mAP IOU@0.1  47.7   S-CNN
Action Localization           THUMOS’14   mAP IOU@0.2  43.5   S-CNN
Action Localization           THUMOS’14   mAP IOU@0.3  36.3   S-CNN
Action Localization           THUMOS’14   mAP IOU@0.4  28.7   S-CNN
Action Localization           THUMOS’14   mAP IOU@0.5  19.0   S-CNN
Action Recognition            THUMOS’14   mAP@0.1      47.7   Shou et al.
Action Recognition            THUMOS’14   mAP@0.2      43.5   Shou et al.
Action Recognition            THUMOS’14   mAP@0.3      36.3   Shou et al.
Action Recognition            THUMOS’14   mAP@0.4      28.7   Shou et al.
Action Recognition            THUMOS’14   mAP@0.5      19.0   Shou et al.
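
The mAP figures above are reported at several temporal IoU thresholds. As a rough illustration of what such a number measures, the sketch below computes average precision for a single action class at one threshold: predictions are ranked by score and each is matched to at most one ground-truth segment in the same video. This is a simplified, non-interpolated variant written for illustration, not the official THUMOS or MEXaction2 evaluation code; the function names and tuple layouts are assumptions.

```python
def temporal_iou(a, b):
    """Temporal IoU of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds, gts, thr):
    """Simplified AP for one class at IoU threshold `thr`.

    preds: list of (video_id, start, end, score)   # assumed layout
    gts:   list of (video_id, start, end)          # assumed layout
    """
    if not gts:
        return 0.0
    matched = set()  # indices of ground-truth segments already claimed
    hits = []        # 1 for a true positive, 0 for a false positive
    for vid, start, end, _score in sorted(preds, key=lambda p: -p[3]):
        best_iou, best_idx = 0.0, None
        for i, (g_vid, g_start, g_end) in enumerate(gts):
            if g_vid != vid or i in matched:
                continue
            iou = temporal_iou((start, end), (g_start, g_end))
            if iou > best_iou:
                best_iou, best_idx = iou, i
        if best_idx is not None and best_iou >= thr:
            matched.add(best_idx)
            hits.append(1)
        else:
            hits.append(0)
    # Sum precision at each rank where a true positive occurs.
    true_positives, precision_sum = 0, 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / len(gts)


if __name__ == "__main__":
    gts = [("v1", 0.0, 10.0), ("v1", 20.0, 30.0)]
    preds = [("v1", 1.0, 9.0, 0.9), ("v1", 50.0, 60.0, 0.8), ("v1", 21.0, 31.0, 0.7)]
    for thr in (0.1, 0.5, 0.9):
        print(thr, average_precision(preds, gts, thr))
```

Averaging this quantity over all action classes gives a mean AP. Raising the threshold can only remove matches, which is consistent with the values in the table decreasing from IoU 0.1 to IoU 0.5.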