Temporal Convolution Based Action Proposal: Submission to ActivityNet 2017

Tianwei Lin, Xu Zhao, Zheng Shou

2017-07-21

Action Classification, Action Localization, General Classification, Temporal Action Localization

Abstract

In this notebook paper, we describe our submission to the temporal action proposal task (task 3) and the temporal action localization task (task 4) of the ActivityNet Challenge hosted at CVPR 2017. Since accuracy on the action classification task is already very high (nearly 90% on the ActivityNet dataset), we believe that the main bottleneck for temporal action localization is the quality of action proposals. We therefore focus mainly on the temporal action proposal task and propose a new proposal model based on a temporal convolutional network. Our approach achieves state-of-the-art performance on both the temporal action proposal task and the temporal action localization task.

Results

Task	Dataset	Metric	Value	Model
Video	ActivityNet-1.3	AR@100	73.01	Lin et al.
Video	ActivityNet-1.3	AUC (test)	64.8	Lin et al.
Video	ActivityNet-1.3	AUC (val)	64.4	Lin et al.
Temporal Action Localization	ActivityNet-1.3	AR@100	73.01	Lin et al.
Temporal Action Localization	ActivityNet-1.3	AUC (test)	64.8	Lin et al.
Temporal Action Localization	ActivityNet-1.3	AUC (val)	64.4	Lin et al.
Zero-Shot Learning	ActivityNet-1.3	AR@100	73.01	Lin et al.
Zero-Shot Learning	ActivityNet-1.3	AUC (test)	64.8	Lin et al.
Zero-Shot Learning	ActivityNet-1.3	AUC (val)	64.4	Lin et al.
Action Localization	ActivityNet-1.3	AR@100	73.01	Lin et al.
Action Localization	ActivityNet-1.3	AUC (test)	64.8	Lin et al.
Action Localization	ActivityNet-1.3	AUC (val)	64.4	Lin et al.
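
The AR@100 rows above report average recall when a limited number of proposals is kept. As a rough illustration of what such a metric measures, the sketch below computes temporal IoU between segments and an average recall for a single video. It assumes recall is averaged over tIoU thresholds from 0.5 to 0.95 in steps of 0.05 and that proposals arrive sorted by descending confidence; the function names and the per-video simplification are hypothetical and are not the official evaluation code.

```python
def temporal_iou(a, b):
    """Temporal IoU between two (start, end) segments, e.g. in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_threshold(ground_truth, proposals, threshold):
    """Fraction of ground-truth segments matched by at least one proposal."""
    if not ground_truth:
        return 0.0
    hits = sum(
        1
        for gt in ground_truth
        if any(temporal_iou(gt, p) >= threshold for p in proposals)
    )
    return hits / len(ground_truth)


def average_recall(ground_truth, proposals, top_n=100, thresholds=None):
    """Mean recall over tIoU thresholds, using only the first top_n proposals.

    Assumption: thresholds default to 0.50, 0.55, ..., 0.95, and proposals
    are already sorted by descending confidence.
    """
    if thresholds is None:
        thresholds = [(50 + 5 * i) / 100 for i in range(10)]
    kept = proposals[:top_n]
    recalls = [recall_at_threshold(ground_truth, kept, t) for t in thresholds]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Toy data (made up for illustration, not from the paper).
    gt = [(0, 10), (20, 30)]
    props = [(0, 10), (21, 30), (50, 60)]
    print(average_recall(gt, props, top_n=100))
    print(average_recall(gt, props, top_n=1))
```

In this toy case the second proposal overlaps its ground-truth segment with tIoU 0.9, so it counts as a hit at every threshold except 0.95. A dataset-level score would additionally aggregate over all videos, and the AUC rows would summarize recall as the number of kept proposals varies.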
