Unsupervised learning of action classes with continuous temporal embedding

Anna Kukleva, Hilde Kuehne, Fadime Sener, Juergen Gall

2019-04-08 | CVPR 2019 6 | Action Segmentation | Unsupervised Action Segmentation

Abstract

The task of temporally detecting and segmenting actions in untrimmed videos has seen increased attention recently. One problem in this context arises from the need to define and label action boundaries to create annotations for training, which is very time- and cost-intensive. To address this issue, we propose an unsupervised approach for learning action classes from untrimmed video sequences. To this end, we use a continuous temporal embedding of framewise features to benefit from the sequential nature of activities. Based on the latent space created by the embedding, we identify clusters of temporal segments across all videos that correspond to semantically meaningful action classes. The approach is evaluated on three challenging datasets, namely the Breakfast dataset, YouTube Instructions, and the 50Salads dataset. While previous works assumed that the videos contain the same high-level activity, we furthermore show that the proposed approach can also be applied to a more general setting where the content of the videos is unknown.
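
The abstract names two steps: embed framewise features so that they reflect temporal position, then cluster the embedded frames across all videos. The following is a minimal sketch of that idea under loud assumptions, not the authors' implementation. The embedding here is simply each frame feature concatenated with the frame's relative timestamp (a stand-in for a learned embedding), the clustering is plain k-means, and the names `embed_video`, `kmeans`, and `time_weight` are hypothetical.

```python
import math

def relative_times(num_frames):
    """Relative position of each frame within its video, in [0, 1]."""
    if num_frames == 1:
        return [0.0]
    return [i / (num_frames - 1) for i in range(num_frames)]

def embed_video(features, time_weight=1.0):
    """Toy time-aware embedding: frame feature plus weighted relative time.

    Assumption: this concatenation only stands in for a learned embedding.
    """
    times = relative_times(len(features))
    return [list(f) + [time_weight * t] for f, t in zip(features, times)]

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic initialisation (evenly spaced points)."""
    step = max(1, len(points) // k)
    centers = [list(points[min(i * step, len(points) - 1)]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels, centers

# Two toy "videos" of different length with 1-D framewise features.
video_a = [[0.0], [0.1], [0.0], [1.0], [0.9], [1.0]]
video_b = [[0.1], [0.0], [1.0], [1.1]]

# Embed each video separately, then cluster the frames of all videos jointly.
pooled = embed_video(video_a) + embed_video(video_b)
labels, centers = kmeans(pooled, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]
```

In this toy run, early frames of both videos fall into one cluster and late frames into the other, which illustrates how clusters found in a shared time-aware space can be read as segments that recur across videos.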

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Action Localization | IKEA ASM | Accuracy | 23.1 | CTE |
| Action Localization | IKEA ASM | F1 | 22.6 | CTE |
| Action Localization | IKEA ASM | JSD | 73.7 | CTE |
| Action Localization | IKEA ASM | Precision | 28.1 | CTE |
| Action Localization | IKEA ASM | Recall | 18.9 | CTE |
| Action Localization | Youtube INRIA Instructional | Acc | 39 | CTE |
| Action Localization | Youtube INRIA Instructional | F1 | 28.3 | CTE |
| Action Localization | Youtube INRIA Instructional | Precision | 39.3 | CTE |
| Action Localization | Youtube INRIA Instructional | Recall | 22.1 | CTE |
| Action Localization | Breakfast | Acc | 41.8 | CTE |
| Action Localization | Breakfast | F1 | 26.4 | CTE |
| Action Localization | Breakfast | JSD | 87.4 | CTE |
| Action Localization | Breakfast | Precision | 25.8 | CTE |
| Action Localization | Breakfast | Recall | 27 | CTE |
| Action Segmentation | IKEA ASM | Accuracy | 23.1 | CTE |
| Action Segmentation | IKEA ASM | F1 | 22.6 | CTE |
| Action Segmentation | IKEA ASM | JSD | 73.7 | CTE |
| Action Segmentation | IKEA ASM | Precision | 28.1 | CTE |
| Action Segmentation | IKEA ASM | Recall | 18.9 | CTE |
| Action Segmentation | Youtube INRIA Instructional | Acc | 39 | CTE |
| Action Segmentation | Youtube INRIA Instructional | F1 | 28.3 | CTE |
| Action Segmentation | Youtube INRIA Instructional | Precision | 39.3 | CTE |
| Action Segmentation | Youtube INRIA Instructional | Recall | 22.1 | CTE |
| Action Segmentation | Breakfast | Acc | 41.8 | CTE |
| Action Segmentation | Breakfast | F1 | 26.4 | CTE |
| Action Segmentation | Breakfast | JSD | 87.4 | CTE |
| Action Segmentation | Breakfast | Precision | 25.8 | CTE |
| Action Segmentation | Breakfast | Recall | 27 | CTE |
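
Because an unsupervised method outputs anonymous cluster ids rather than named actions, frame accuracy is commonly computed after finding the best one-to-one mapping between clusters and ground-truth classes. This page does not state the exact protocol behind each number above, so the sketch below only illustrates the general idea. It uses a brute-force search over permutations in place of the Hungarian algorithm (practical only for a handful of classes), and `best_mapping_accuracy` is a hypothetical helper name.

```python
from itertools import permutations

def best_mapping_accuracy(pred, gt):
    """Frame accuracy under the best one-to-one cluster-to-class mapping.

    Assumption: brute force over permutations stands in for Hungarian matching.
    """
    clusters = sorted(set(pred))
    classes = sorted(set(gt))
    # If there are more clusters than classes, surplus clusters map to None.
    if len(classes) < len(clusters):
        classes = classes + [None] * (len(clusters) - len(classes))
    best_acc, best_map = -1.0, {}
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        correct = sum(mapping[p] == g for p, g in zip(pred, gt))
        acc = correct / len(gt)
        if acc > best_acc:
            best_acc, best_map = acc, mapping
    return best_acc, best_map

# Toy framewise predictions (cluster ids) and ground-truth labels.
pred = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]
gt = ["pour", "pour", "stir", "stir", "stir",
      "stir", "pour", "pour", "stir", "stir"]

acc, mapping = best_mapping_accuracy(pred, gt)
print(acc)      # 0.9
print(mapping)  # {0: 'pour', 1: 'stir'}
```

Nine of the ten toy frames agree with the ground truth once cluster 0 is read as "pour" and cluster 1 as "stir"; the opposite assignment would match only one frame.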

Related Papers

Self-supervised pretraining of vision transformers for animal behavioral analysis and neural encoding (2025-07-13)
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios (2025-06-11)
EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models (2025-06-02)
M2R2: MulitModal Robotic Representation for Temporal Action Segmentation (2025-04-25)
Uni4D: A Unified Self-Supervised Learning Framework for Point Cloud Videos (2025-04-07)
Towards Generalizing Temporal Action Segmentation to Unseen Views (2025-04-03)
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning (2025-03-27)
Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation (2025-03-24)