
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Leveraging triplet loss for unsupervised action segmentation

E. Bueno-Benito, B. Tura, M. Dimiccoli

2023-04-13 | Action Segmentation | Unsupervised Action Segmentation | Metric Learning | Segmentation | Clustering | Video Understanding

Abstract

In this paper, we propose a novel fully unsupervised framework that learns action representations suitable for the action segmentation task from the single input video itself, without requiring any training data. Our method is a deep metric learning approach rooted in a shallow network with a triplet loss operating on similarity distributions and a novel triplet selection strategy that effectively models temporal and semantic priors to discover actions in the new representational space. Under these circumstances, we successfully recover temporal boundaries in the learned action representations with higher quality compared with existing unsupervised approaches. The proposed method is evaluated on two widely used benchmark datasets for the action segmentation task and it achieves competitive performance by applying a generic clustering algorithm on the learned representations.
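
The abstract names a triplet loss and a temporally informed triplet selection but gives no formulas. As a rough illustration only, the sketch below pairs the standard triplet margin loss on Euclidean distances (not the paper's loss on similarity distributions) with a purely temporal sampling rule. The function names, the `near` and `far` thresholds, and the margin are hypothetical and do not come from the paper.

```python
import numpy as np


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss, averaged over a batch of embeddings."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)  # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative, axis=1)  # anchor-negative distance
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def sample_temporal_triplets(n_frames, near=5, far=50, n_triplets=64, seed=0):
    """Pick (anchor, positive, negative) frame indices from a temporal prior only.

    Assumption for illustration: frames within `near` steps of the anchor are
    treated as the same action, frames at least `far` steps away as different.
    """
    if near < 1 or n_frames <= far:
        raise ValueError("need near >= 1 and n_frames > far")
    rng = np.random.default_rng(seed)
    frames = np.arange(n_frames)
    triplets = []
    while len(triplets) < n_triplets:
        a = int(rng.integers(0, n_frames))
        p = int(np.clip(a + rng.integers(-near, near + 1), 0, n_frames - 1))
        candidates = np.flatnonzero(np.abs(frames - a) >= far)
        if p == a or candidates.size == 0:
            continue  # resample when no valid positive or negative exists
        n = int(rng.choice(candidates))
        triplets.append((a, p, n))
    return np.array(triplets)


# Usage with random stand-in features (one 16-dimensional vector per frame).
features = np.random.default_rng(1).normal(size=(200, 16))
idx = sample_temporal_triplets(n_frames=200)
loss = triplet_margin_loss(features[idx[:, 0]], features[idx[:, 1]], features[idx[:, 2]])
```

In a full pipeline, a small network would be trained with such a loss on the per-frame features of a single video, and the resulting embeddings would then be clustered (for example with K-means) to obtain segments.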

Results

Task                 Dataset                      Metric  Value  Model
Action Localization  Breakfast                    Acc     65.1   TSA (FINCH)
Action Localization  Breakfast                    mIoU    52.1   TSA (FINCH)
Action Localization  Breakfast                    Acc     63.7   TSA (Kmeans)
Action Localization  Breakfast                    F1      58     TSA (Kmeans)
Action Localization  Breakfast                    mIoU    53.3   TSA (Kmeans)
Action Localization  Breakfast                    Acc     63.2   TSA (Spectral)
Action Localization  Breakfast                    F1      57.8   TSA (Spectral)
Action Localization  Breakfast                    mIoU    52.7   TSA (Spectral)
Action Localization  Youtube INRIA Instructional  Acc     62.4   TSA (FINCH)
Action Localization  Youtube INRIA Instructional  F1      54.7   TSA (FINCH)
Action Localization  Youtube INRIA Instructional  Acc     59.7   TSA (Kmeans)
Action Localization  Youtube INRIA Instructional  F1      55.3   TSA (Kmeans)
Action Segmentation  Breakfast                    Acc     65.1   TSA (FINCH)
Action Segmentation  Breakfast                    mIoU    52.1   TSA (FINCH)
Action Segmentation  Breakfast                    Acc     63.7   TSA (Kmeans)
Action Segmentation  Breakfast                    F1      58     TSA (Kmeans)
Action Segmentation  Breakfast                    mIoU    53.3   TSA (Kmeans)
Action Segmentation  Breakfast                    Acc     63.2   TSA (Spectral)
Action Segmentation  Breakfast                    F1      57.8   TSA (Spectral)
Action Segmentation  Breakfast                    mIoU    52.7   TSA (Spectral)
Action Segmentation  Youtube INRIA Instructional  Acc     62.4   TSA (FINCH)
Action Segmentation  Youtube INRIA Instructional  F1      54.7   TSA (FINCH)
Action Segmentation  Youtube INRIA Instructional  Acc     59.7   TSA (Kmeans)
Action Segmentation  Youtube INRIA Instructional  F1      55.3   TSA (Kmeans)

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Tri-Learn Graph Fusion Network for Attributed Graph Clustering (2025-07-18)
Unsupervised Ground Metric Learning (2025-07-17)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)