
Diffusion Action Segmentation

Daochang Liu, Qiyue Li, AnhDung Dinh, Tingting Jiang, Mubarak Shah, Chang Xu

Published: 2023-03-31
Venue: ICCV 2023
Tasks: Denoising, Action Segmentation, Temporal Action Segmentation, Segmentation

Abstract

Temporal action segmentation is crucial for understanding long-form videos. Previous works on this task commonly adopt an iterative refinement paradigm by using multi-stage models. We propose a novel framework via denoising diffusion models, which nonetheless shares the same inherent spirit of such iterative refinement. In this framework, action predictions are iteratively generated from random noise with input video features as conditions. To enhance the modeling of three striking characteristics of human actions, including the position prior, the boundary ambiguity, and the relational dependency, we devise a unified masking strategy for the conditioning inputs in our framework. Extensive experiments on three benchmark datasets, i.e., GTEA, 50Salads, and Breakfast, are performed and the proposed method achieves superior or comparable results to state-of-the-art methods, showing the effectiveness of a generative approach for action segmentation.
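The idea of generating per-frame action predictions iteratively from random noise, conditioned on video features, can be illustrated with a toy sampling loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: the cosine noise schedule, the deterministic DDIM-style update, and the names `cosine_alpha_bar`, `toy_denoiser`, and `sample_actions` are all hypothetical choices made for illustration. In particular, `toy_denoiser` stands in for a learned network and simply echoes the conditioning features.

```python
import math
import random


def cosine_alpha_bar(t, num_steps):
    """Cumulative signal level at step t (cosine schedule, an assumption)."""
    return math.cos((t / num_steps) * math.pi / 2) ** 2


def toy_denoiser(noisy, features, t):
    """Hypothetical stand-in for a learned, feature-conditioned network.

    A real model would use `noisy` and `t`; this toy version ignores them and
    returns the conditioning features as its estimate of the clean scores.
    """
    return [list(row) for row in features]


def sample_actions(features, num_steps=8, seed=0):
    """Refine random noise into per-frame class labels, step by step."""
    rng = random.Random(seed)
    num_classes = len(features[0])
    # Start from pure Gaussian noise over the (frames x classes) score matrix.
    x = [[rng.gauss(0.0, 1.0) for _ in range(num_classes)] for _ in features]
    for t in range(num_steps, 0, -1):
        x0_hat = toy_denoiser(x, features, t)
        a_t = cosine_alpha_bar(t, num_steps)
        a_prev = cosine_alpha_bar(t - 1, num_steps)
        refined = []
        for row, row0 in zip(x, x0_hat):
            new_row = []
            for v, v0 in zip(row, row0):
                # Noise implied by the current sample and the clean estimate.
                eps = (v - math.sqrt(a_t) * v0) / math.sqrt(1.0 - a_t)
                # Deterministic DDIM-style move to the lower noise level t-1.
                new_row.append(
                    math.sqrt(a_prev) * v0 + math.sqrt(1.0 - a_prev) * eps
                )
            refined.append(new_row)
        x = refined
    # Pick the highest-scoring class for each frame.
    return [max(range(num_classes), key=lambda c: row[c]) for row in x]


# Three frames, two classes: made-up conditioning scores for illustration.
labels = sample_actions([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

Each pass lowers the noise level, so the loop plays the role that successive refinement stages play in a multi-stage model. With the toy denoiser, the result reduces to the per-frame argmax of the conditioning scores.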

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Action Localization | 50 Salads | Acc | 88.9 | DiffAct |
| Action Localization | 50 Salads | Edit | 85 | DiffAct |
| Action Localization | 50 Salads | F1@10% | 90.1 | DiffAct |
| Action Localization | 50 Salads | F1@25% | 89.2 | DiffAct |
| Action Localization | 50 Salads | F1@50% | 83.7 | DiffAct |
| Action Localization | GTEA | Acc | 82.2 | DiffAct |
| Action Localization | GTEA | Edit | 89.6 | DiffAct |
| Action Localization | GTEA | F1@10% | 92.5 | DiffAct |
| Action Localization | GTEA | F1@25% | 91.5 | DiffAct |
| Action Localization | GTEA | F1@50% | 84.7 | DiffAct |
| Action Localization | Breakfast | Acc | 76.4 | DiffAct |
| Action Localization | Breakfast | Average F1 | 73.6 | DiffAct |
| Action Localization | Breakfast | Edit | 78.4 | DiffAct |
| Action Localization | Breakfast | F1@10% | 80.3 | DiffAct |
| Action Localization | Breakfast | F1@25% | 75.9 | DiffAct |
| Action Localization | Breakfast | F1@50% | 64.6 | DiffAct |
| Action Segmentation | 50 Salads | Acc | 88.9 | DiffAct |
| Action Segmentation | 50 Salads | Edit | 85 | DiffAct |
| Action Segmentation | 50 Salads | F1@10% | 90.1 | DiffAct |
| Action Segmentation | 50 Salads | F1@25% | 89.2 | DiffAct |
| Action Segmentation | 50 Salads | F1@50% | 83.7 | DiffAct |
| Action Segmentation | GTEA | Acc | 82.2 | DiffAct |
| Action Segmentation | GTEA | Edit | 89.6 | DiffAct |
| Action Segmentation | GTEA | F1@10% | 92.5 | DiffAct |
| Action Segmentation | GTEA | F1@25% | 91.5 | DiffAct |
| Action Segmentation | GTEA | F1@50% | 84.7 | DiffAct |
| Action Segmentation | Breakfast | Acc | 76.4 | DiffAct |
| Action Segmentation | Breakfast | Average F1 | 73.6 | DiffAct |
| Action Segmentation | Breakfast | Edit | 78.4 | DiffAct |
| Action Segmentation | Breakfast | F1@10% | 80.3 | DiffAct |
| Action Segmentation | Breakfast | F1@25% | 75.9 | DiffAct |
| Action Segmentation | Breakfast | F1@50% | 64.6 | DiffAct |
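
The page does not define the Acc, Edit, and F1@k metrics reported above. The sketch below assumes the definitions commonly used in temporal action segmentation evaluation: frame-wise accuracy, a segment-level edit score based on Levenshtein distance, and a segmental F1 at an IoU threshold. The function names and the toy label sequences are hypothetical. The last line checks an observation rather than a documented rule: the Breakfast "Average F1" of 73.6 is numerically consistent with the mean of the three F1 values, although the page does not say how it was computed.

```python
def segments(labels):
    """Collapse per-frame labels into (label, start, end) runs, end exclusive."""
    runs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.append((labels[start], start, i))
            start = i
    return runs


def frame_accuracy(pred, gt):
    """Percentage of frames whose predicted label matches the ground truth."""
    return 100.0 * sum(p == g for p, g in zip(pred, gt)) / len(gt)


def edit_score(pred, gt):
    """100 * (1 - normalized Levenshtein distance between segment label lists)."""
    p = [s[0] for s in segments(pred)]
    g = [s[0] for s in segments(gt)]
    prev = list(range(len(g) + 1))
    for i, p_label in enumerate(p, 1):
        cur = [i]
        for j, g_label in enumerate(g, 1):
            cur.append(min(prev[j] + 1,                          # deletion
                           cur[j - 1] + 1,                       # insertion
                           prev[j - 1] + (p_label != g_label)))  # substitution
        prev = cur
    return 100.0 * (1.0 - prev[-1] / max(len(p), len(g)))


def f1_at(pred, gt, threshold):
    """Segmental F1: a predicted segment is a true positive if its best
    same-label ground-truth match has IoU >= threshold and is still unmatched."""
    g_segs = segments(gt)
    used = [False] * len(g_segs)
    tp = fp = 0
    for label, ps, pe in segments(pred):
        best_iou, best_j = 0.0, -1
        for j, (g_label, gs, ge) in enumerate(g_segs):
            if g_label != label:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            union = max(pe, ge) - min(ps, gs)
            if inter / union > best_iou:
                best_iou, best_j = inter / union, j
        if best_j >= 0 and best_iou >= threshold and not used[best_j]:
            tp += 1
            used[best_j] = True
        else:
            fp += 1
    fn = used.count(False)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy sequences (made up): one boundary is predicted one frame too early.
gt = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
pred = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2]
acc, edit, f1_50 = frame_accuracy(pred, gt), edit_score(pred, gt), f1_at(pred, gt, 0.5)

# Mean of the reported Breakfast F1@{10,25,50} values, rounded to one decimal.
breakfast_mean_f1 = round((80.3 + 75.9 + 64.6) / 3, 1)
```

In the toy example, the single shifted boundary lowers frame accuracy but leaves the segment ordering intact, which is why the segment-level metrics are reported alongside Acc.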

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models (2025-07-17)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)