Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Temporal Convolutional Networks for Action Segmentation and Detection

Colin Lea, Michael D. Flynn, Rene Vidal, Austin Reiter, Gregory D. Hager

2016-11-16 | CVPR 2017 | Action Segmentation | Skeleton Based Action Recognition

Abstract

The ability to identify and temporally segment fine-grained human actions throughout a video is crucial for robotics, surveillance, education, and beyond. Typical approaches decouple this problem by first extracting local spatiotemporal features from video frames and then feeding them into a temporal classifier that captures high-level temporal patterns. We introduce a new class of temporal models, which we call Temporal Convolutional Networks (TCNs), that use a hierarchy of temporal convolutions to perform fine-grained action segmentation or detection. Our Encoder-Decoder TCN uses pooling and upsampling to efficiently capture long-range temporal patterns, whereas our Dilated TCN uses dilated convolutions. We show that TCNs are capable of capturing action compositions, segment durations, and long-range dependencies, and are more than an order of magnitude faster to train than competing LSTM-based Recurrent Neural Networks. We apply these models to three challenging fine-grained datasets and show large improvements over the state of the art.
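The two mechanisms named in the abstract can be illustrated with a small single-channel sketch. This is not the authors' implementation: the kernel size, the doubling dilation schedule, the pooling width, and every function name below are assumptions chosen for illustration, and the sketch omits learned filters, multiple channels, and nonlinearities. It only shows how pooling plus upsampling, or stacked dilated convolutions, let each output frame depend on a long temporal window.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float],
                   dilation: int = 1) -> List[float]:
    """Same-length temporal convolution with zero padding.

    Assumes an odd-length kernel centered on the current frame (acausal).
    """
    half = (len(kernel) - 1) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t + (i - half) * dilation  # taps are spaced `dilation` frames apart
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def max_pool1d(x: Sequence[float], width: int = 2) -> List[float]:
    """Encoder step: keep the maximum of each non-overlapping window."""
    return [max(x[i:i + width]) for i in range(0, len(x), width)]


def upsample1d(x: Sequence[float], factor: int = 2) -> List[float]:
    """Decoder step: repeat each value `factor` times."""
    return [v for v in x for _ in range(factor)]


def dilated_receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input frames that can influence one output frame of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    # Dilated stack: hypothetical schedule 1, 2, 4, 8 with kernel size 3.
    signal = [0.0] * 41
    signal[20] = 1.0  # unit impulse in the middle
    for d in (1, 2, 4, 8):
        signal = dilated_conv1d(signal, [1.0, 1.0, 1.0], dilation=d)
    print(sum(1 for v in signal if v != 0.0),
          dilated_receptive_field(3, [1, 2, 4, 8]))

    # Encoder-decoder: two pooling steps shrink 8 frames to 2, two upsampling steps restore 8.
    frames = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0, 6.0, 1.0]
    encoded = max_pool1d(max_pool1d(frames))
    decoded = upsample1d(upsample1d(encoded))
    print(len(encoded), len(decoded))
```

In the dilated stack the receptive field grows with the sum of the dilations while the sequence length stays fixed; in the encoder-decoder the sequence is shortened so that a small kernel applied at the coarsest level already spans many original frames.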

Results

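| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Action Localization | GTEA | Acc | 64 | ED-TCN |
| Action Localization | GTEA | F1@10% | 72.2 | ED-TCN |
| Action Localization | GTEA | F1@25% | 69.3 | ED-TCN |
| Action Localization | GTEA | F1@50% | 56 | ED-TCN |
| Action Segmentation | GTEA | Acc | 64 | ED-TCN |
| Action Segmentation | GTEA | F1@10% | 72.2 | ED-TCN |
| Action Segmentation | GTEA | F1@25% | 69.3 | ED-TCN |
| Action Segmentation | GTEA | F1@50% | 56 | ED-TCN |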
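The page does not define the metrics in the table. Assuming Acc is frame-wise accuracy and F1@k is the segmental F1 score at an intersection-over-union (IoU) threshold of k%, as is usual for action segmentation benchmarks, the following sketch shows one common way to compute them. The function names and the matching rule (each ground-truth segment can be claimed by at most one predicted segment) are assumptions for illustration, not code taken from the paper, and the results are fractions in [0, 1] rather than the percentages shown in the table.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int, int]  # (label, start frame, end frame exclusive)


def to_segments(labels: Sequence[int]) -> List[Segment]:
    """Collapse per-frame labels into runs of identical labels."""
    segments = []
    start = 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segments.append((labels[start], start, t))
            start = t
    return segments


def frame_accuracy(pred: Sequence[int], true: Sequence[int]) -> float:
    """Fraction of frames whose predicted label matches the ground truth."""
    return sum(p == t for p, t in zip(pred, true)) / len(true)


def segmental_f1(pred: Sequence[int], true: Sequence[int],
                 iou_threshold: float) -> float:
    """F1 over segments; a prediction counts as a hit if its IoU with an
    unmatched ground-truth segment of the same label reaches the threshold."""
    pred_segs, true_segs = to_segments(pred), to_segments(true)
    matched = [False] * len(true_segs)
    tp = fp = 0
    for label, ps, pe in pred_segs:
        best_iou, best_j = 0.0, -1
        for j, (tl, ts, te) in enumerate(true_segs):
            if tl != label:
                continue
            inter = max(0, min(pe, te) - max(ps, ts))
            # Equals the true union whenever the segments overlap;
            # when they do not, inter is 0 and the IoU is 0 either way.
            union = max(pe, te) - min(ps, ts)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j >= 0 and best_iou >= iou_threshold and not matched[best_j]:
            tp += 1
            matched[best_j] = True
        else:
            fp += 1
    fn = matched.count(False)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    true = [0] * 8
    pred = [0, 0, 0, 1, 0, 0, 0, 0]  # one wrong frame splits the segment in three
    print(frame_accuracy(pred, true), segmental_f1(pred, true, 0.25))
```

The example shows why both kinds of metric are reported: a single misclassified frame barely changes frame accuracy, but it fragments the prediction into extra segments, which the segmental score counts as false positives.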

Related Papers

Self-supervised pretraining of vision transformers for animal behavioral analysis and neural encoding (2025-07-13)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios (2025-06-11)
EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models (2025-06-02)
3D Skeleton-Based Action Recognition: A Review (2025-06-01)
Spatio-Temporal Joint Density Driven Learning for Skeleton-Based Action Recognition (2025-05-29)
M2R2: MulitModal Robotic Representation for Temporal Action Segmentation (2025-04-25)