Temporal Convolutional Networks: A Unified Approach to Action Segmentation

Colin Lea, Rene Vidal, Austin Reiter, Gregory D. Hager

2016-08-29 | Action Segmentation, Segmentation
Abstract

The dominant paradigm for video-based action segmentation is composed of two steps: first, for each frame, compute low-level features using Dense Trajectories or a Convolutional Neural Network that encode spatiotemporal information locally, and second, input these features into a classifier that captures high-level temporal relationships, such as a Recurrent Neural Network (RNN). While often effective, this decoupling requires specifying two separate models, each with their own complexities, and prevents capturing more nuanced long-range spatiotemporal relationships. We propose a unified approach, as demonstrated by our Temporal Convolutional Network (TCN), that hierarchically captures relationships at low-, intermediate-, and high-level time-scales. Our model achieves superior or competitive performance using video or sensor data on three public action segmentation datasets and can be trained in a fraction of the time it takes to train an RNN.
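
The idea of capturing low-, intermediate-, and high-level time-scales in one model can be illustrated with a small receptive-field calculation. The sketch below assumes each block is a stride-1 temporal convolution followed by non-overlapping pooling; the function name, kernel size, pooling size, and depth are hypothetical choices for illustration and are not taken from the paper.

```python
def receptive_fields(num_blocks, kernel_size, pool_size=2):
    """Return the temporal receptive field (in input frames) after each block.

    Assumed block layout (illustrative, not the authors' configuration):
    a stride-1 temporal convolution followed by non-overlapping pooling.
    """
    rf = 1      # frames seen by one output unit so far
    jump = 1    # distance in input frames between adjacent units
    fields = []
    for _ in range(num_blocks):
        rf += (kernel_size - 1) * jump  # stride-1 convolution widens the window
        rf += (pool_size - 1) * jump    # pooling window widens it further
        jump *= pool_size               # pooling stride spreads units apart
        fields.append(rf)
    return fields


if __name__ == "__main__":
    # Deeper blocks cover progressively longer spans of the input sequence.
    for depth, frames in enumerate(receptive_fields(3, kernel_size=3), start=1):
        print(f"block {depth}: {frames} input frames")
```

Under these assumptions, each additional block sees a longer span of frames, which is one way to read the abstract's claim that a single convolutional hierarchy can replace separate local-feature and temporal-classifier stages.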

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | NTU RGB+D | Accuracy (CV) | 83.1 | TCN |
| Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 83.1 | TCN |
| Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 83.1 | TCN |
| Activity Recognition | NTU RGB+D | Accuracy (CV) | 83.1 | TCN |
| Action Localization | NTU RGB+D | Accuracy (CV) | 83.1 | TCN |
| Action Localization | JIGSAWS | Accuracy | 81.4 | TCN |
| Action Localization | JIGSAWS | Edit Distance | 83.1 | TCN |
| Action Detection | NTU RGB+D | Accuracy (CV) | 83.1 | TCN |
| 3D Action Recognition | NTU RGB+D | Accuracy (CV) | 83.1 | TCN |
| Action Recognition | NTU RGB+D | Accuracy (CV) | 83.1 | TCN |
| Action Segmentation | JIGSAWS | Accuracy | 81.4 | TCN |
| Action Segmentation | JIGSAWS | Edit Distance | 83.1 | TCN |
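
For readers unfamiliar with the two JIGSAWS metrics listed above, the following is a minimal sketch of how frame-wise accuracy and a segmental edit score are commonly computed for action segmentation. It assumes the edit score is a Levenshtein distance over the sequences of segment labels (consecutive repeats collapsed), normalized by the longer sequence and scaled to 0-100; the function names are hypothetical and the exact evaluation protocol behind the reported numbers may differ.

```python
def frame_accuracy(pred, true):
    """Percentage of frames whose predicted label matches the ground truth."""
    if len(pred) != len(true):
        raise ValueError("pred and true must have the same number of frames")
    if not true:
        return 100.0
    correct = sum(p == t for p, t in zip(pred, true))
    return 100.0 * correct / len(true)


def segments(labels):
    """Collapse consecutive repeated frame labels into one label per segment."""
    segs = []
    for lab in labels:
        if not segs or segs[-1] != lab:
            segs.append(lab)
    return segs


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def edit_score(pred, true):
    """Normalized segmental edit score in [0, 100]; higher is better."""
    p, t = segments(pred), segments(true)
    longest = max(len(p), len(t))
    if longest == 0:
        return 100.0
    return 100.0 * (longest - levenshtein(p, t)) / longest


if __name__ == "__main__":
    true = [0, 0, 0, 1, 1, 2, 2, 2]
    pred = [0, 0, 1, 1, 1, 2, 0, 2]  # one shifted boundary, one spurious segment
    print(frame_accuracy(pred, true), edit_score(pred, true))
```

The two metrics penalize different errors: a slightly shifted boundary lowers accuracy but leaves the segment order intact, while a brief spurious segment costs few frames yet adds segments to the predicted sequence and lowers the edit score.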
