Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Alleviating Over-segmentation Errors by Detecting Action Boundaries

Yuchi Ishikawa, Seito Kasai, Yoshimitsu Aoki, Hirokatsu Kataoka

2020-07-14 | Action Segmentation, Action Classification, regression, Temporal Action Segmentation, Segmentation

Abstract

We propose an effective framework for the temporal action segmentation task, namely an Action Segment Refinement Framework (ASRF). Our model architecture consists of a long-term feature extractor and two branches: the Action Segmentation Branch (ASB) and the Boundary Regression Branch (BRB). The long-term feature extractor provides shared features for the two branches with a wide temporal receptive field. The ASB classifies video frames with action classes, while the BRB regresses the action boundary probabilities. The action boundaries predicted by the BRB refine the output from the ASB, which results in a significant performance improvement. Our contributions are three-fold: (i) We propose a framework for temporal action segmentation, the ASRF, which divides temporal action segmentation into frame-wise action classification and action boundary regression. Our framework refines frame-level hypotheses of action classes using predicted action boundaries. (ii) We propose a loss function for smoothing the transition of action probabilities, and analyze combinations of various loss functions for temporal action segmentation. (iii) Our framework outperforms state-of-the-art methods on three challenging datasets, offering an improvement of up to 13.7% in terms of segmental edit distance and up to 16.1% in terms of segmental F1 score. Our code will be publicly available soon.
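The refinement step described in the abstract (boundaries from the BRB correcting the frame-wise output of the ASB) can be illustrated with a minimal sketch. It assumes, and the abstract does not confirm, that a boundary is a local maximum of the boundary probability above a fixed threshold and that each segment between boundaries takes the majority class of its frames. The function names `detect_boundaries` and `refine`, and the threshold of 0.5, are hypothetical choices for illustration, not the authors' implementation.

```python
from collections import Counter
from typing import List, Sequence


def detect_boundaries(boundary_prob: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return the frame indices at which a new segment starts.

    Assumption: frame 0 always starts a segment, and any later frame is a
    boundary when its probability exceeds `threshold` and is a local maximum.
    """
    n = len(boundary_prob)
    starts = [0] if n else []
    for t in range(1, n):
        p = boundary_prob[t]
        left = boundary_prob[t - 1]
        right = boundary_prob[t + 1] if t + 1 < n else float("-inf")
        # Strictly above the left neighbour so a plateau yields one boundary.
        if p > threshold and p > left and p >= right:
            starts.append(t)
    return starts


def refine(frame_labels: Sequence[int],
           boundary_prob: Sequence[float],
           threshold: float = 0.5) -> List[int]:
    """Relabel every segment between detected boundaries with its majority class."""
    if len(frame_labels) != len(boundary_prob):
        raise ValueError("frame_labels and boundary_prob must have the same length")
    starts = detect_boundaries(boundary_prob, threshold)
    refined: List[int] = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(frame_labels)
        segment = list(frame_labels[start:end])
        majority = Counter(segment).most_common(1)[0][0]
        refined.extend([majority] * len(segment))
    return refined


# Toy example: two isolated misclassified frames (class 1) cause over-segmentation.
asb_labels = [0, 0, 1, 0, 0, 2, 2, 1, 2, 2]
brb_probs = [0.0, 0.1, 0.2, 0.1, 0.1, 0.9, 0.2, 0.3, 0.1, 0.0]
print(refine(asb_labels, brb_probs))  # [0, 0, 0, 0, 0, 2, 2, 2, 2, 2]
```

In this toy input the only confident boundary is at frame 5, so the spurious single-frame segments on either side of it are absorbed into their surrounding actions.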

Results

Task                 Dataset    Metric      Value  Model
Action Localization  50 Salads  Acc         84.5   ASRF
Action Localization  50 Salads  Edit        79.3   ASRF
Action Localization  50 Salads  F1@10%      84.9   ASRF
Action Localization  50 Salads  F1@25%      83.5   ASRF
Action Localization  50 Salads  F1@50%      77.3   ASRF
Action Localization  GTEA       Acc         77.3   ASRF
Action Localization  GTEA       Edit        83.7   ASRF
Action Localization  GTEA       F1@10%      89.4   ASRF
Action Localization  GTEA       F1@25%      87.8   ASRF
Action Localization  GTEA       F1@50%      79.8   ASRF
Action Localization  Breakfast  Acc         67.6   ASRF
Action Localization  Breakfast  Average F1  66.4   ASRF
Action Localization  Breakfast  Edit        72.4   ASRF
Action Localization  Breakfast  F1@10%      74.3   ASRF
Action Localization  Breakfast  F1@25%      68.9   ASRF
Action Localization  Breakfast  F1@50%      56.1   ASRF
Action Segmentation  50 Salads  Acc         84.5   ASRF
Action Segmentation  50 Salads  Edit        79.3   ASRF
Action Segmentation  50 Salads  F1@10%      84.9   ASRF
Action Segmentation  50 Salads  F1@25%      83.5   ASRF
Action Segmentation  50 Salads  F1@50%      77.3   ASRF
Action Segmentation  GTEA       Acc         77.3   ASRF
Action Segmentation  GTEA       Edit        83.7   ASRF
Action Segmentation  GTEA       F1@10%      89.4   ASRF
Action Segmentation  GTEA       F1@25%      87.8   ASRF
Action Segmentation  GTEA       F1@50%      79.8   ASRF
Action Segmentation  Breakfast  Acc         67.6   ASRF
Action Segmentation  Breakfast  Average F1  66.4   ASRF
Action Segmentation  Breakfast  Edit        72.4   ASRF
Action Segmentation  Breakfast  F1@10%      74.3   ASRF
Action Segmentation  Breakfast  F1@25%      68.9   ASRF
Action Segmentation  Breakfast  F1@50%      56.1   ASRF
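
The Edit metric reported above penalizes over-segmentation because it compares the order of predicted segments rather than individual frames. The sketch below follows a commonly used definition, a Levenshtein distance between the collapsed segment label sequences normalized by the longer sequence; it is an assumption that the reported numbers use exactly this variant, and evaluation scripts may differ in details such as background-class handling. The helper names `to_segments` and `edit_score` are hypothetical.

```python
from typing import List, Sequence


def to_segments(labels: Sequence[int]) -> List[int]:
    """Collapse consecutive repeated frame labels into one label per segment."""
    segments: List[int] = []
    for label in labels:
        if not segments or segments[-1] != label:
            segments.append(label)
    return segments


def edit_score(pred: Sequence[int], gt: Sequence[int]) -> float:
    """Segmental edit score in [0, 100]; higher is better."""
    p, g = to_segments(pred), to_segments(gt)
    m, n = len(p), len(g)
    if max(m, n) == 0:
        return 100.0
    # Standard Levenshtein dynamic programme, keeping only two rows.
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if p[i - 1] == g[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution or match
        prev = curr
    return (1.0 - prev[n] / max(m, n)) * 100.0


ground_truth = [0, 0, 0, 0, 0, 2, 2, 2, 2, 2]
over_segmented = [0, 0, 1, 0, 0, 2, 2, 1, 2, 2]  # 80% frame accuracy

print(edit_score(ground_truth, ground_truth))    # 100.0
print(round(edit_score(over_segmented, ground_truth), 1))  # 33.3
```

The over-segmented prediction gets 8 of 10 frames right, yet its six segments against the ground truth's two push the edit score down to about 33, which is why frame accuracy and Edit are reported separately.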
