
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Efficient Temporal Action Segmentation via Boundary-aware Query Voting

Peiyao Wang, Yuewei Lin, Erik Blasch, Jie Wei, Haibin Ling

Date: 2024-05-25
Tasks: Action Segmentation, Temporal Action Segmentation, Segmentation, Semantic Segmentation, Instance Segmentation

Abstract

Although the performance of Temporal Action Segmentation (TAS) has improved in recent years, strong results often come at a high computational cost due to dense inputs, complex model structures, and resource-intensive post-processing. To improve efficiency while maintaining performance, we present a novel perspective centered on per-segment classification. Harnessing the capabilities of Transformers, we tokenize each video segment as an instance token, endowed with intrinsic instance segmentation. To realize efficient action segmentation, we introduce BaFormer, a boundary-aware Transformer network. It employs instance queries for instance segmentation and a global query for class-agnostic boundary prediction, yielding continuous segment proposals. During inference, BaFormer employs a simple yet effective voting strategy to classify boundary-wise segments based on instance segmentation. Remarkably, as a single-stage approach, BaFormer significantly reduces computational cost, using only 6% of the running time of the state-of-the-art method DiffAct, while achieving better or comparable accuracy on several popular benchmarks. The code for this project is publicly available at https://github.com/peiyao-w/BaFormer.
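The inference step described in the abstract (boundary-delimited segments classified by voting over instance segmentation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `frame_labels_from_queries` and `vote_segments`, the array shapes, and the choice of aggregating query masks with query class probabilities followed by a per-segment majority vote are all assumptions made for illustration. The official code at the URL above is the reference.

```python
import numpy as np

def frame_labels_from_queries(mask_probs, class_probs):
    """Hypothetical aggregation step (an assumption, not the paper's exact rule).

    mask_probs:  (N, T) per-query soft masks over T frames.
    class_probs: (N, C) per-query class probabilities.
    Returns a length-T array with the best-scoring class per frame.
    """
    frame_scores = mask_probs.T @ class_probs  # shape (T, C)
    return frame_scores.argmax(axis=1)

def vote_segments(frame_labels, boundaries, num_classes):
    """Assign one class to each boundary-delimited segment by majority vote.

    frame_labels: length-T array of non-negative integer labels.
    boundaries:   frame indices where a new segment starts (class-agnostic).
    Returns the smoothed frame labels and a list of (start, end, label),
    with end exclusive.
    """
    T = len(frame_labels)
    # Segment edges: video start, each interior boundary, video end.
    edges = [0] + sorted(b for b in set(boundaries) if 0 < b < T) + [T]
    voted = np.empty(T, dtype=int)
    segments = []
    for start, end in zip(edges[:-1], edges[1:]):
        counts = np.bincount(frame_labels[start:end], minlength=num_classes)
        label = int(counts.argmax())  # ties resolve to the lowest class id
        voted[start:end] = label
        segments.append((start, end, label))
    return voted, segments

# Usage: one boundary at frame 4 splits 8 frames into two segments.
noisy = np.array([0, 0, 1, 0, 2, 2, 1, 2])
voted, segments = vote_segments(noisy, boundaries=[4], num_classes=3)
print(voted)     # isolated mislabeled frames are overruled by the vote
print(segments)
```

Because each segment receives exactly one label, isolated frame-level errors inside a segment cannot cause over-segmentation, which is one reason a voting step of this kind can stand in for heavier post-processing.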

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Action Localization | 50 Salads | Acc | 89.5 | BaFormer |
| Action Localization | 50 Salads | Edit | 84.2 | BaFormer |
| Action Localization | 50 Salads | F1@10% | 89.3 | BaFormer |
| Action Localization | 50 Salads | F1@25% | 88.4 | BaFormer |
| Action Localization | 50 Salads | F1@50% | 83.9 | BaFormer |
| Action Localization | GTEA | Acc | 83 | BaFormer |
| Action Localization | GTEA | Edit | 88.7 | BaFormer |
| Action Localization | GTEA | F1@10% | 92 | BaFormer |
| Action Localization | GTEA | F1@25% | 91.3 | BaFormer |
| Action Localization | GTEA | F1@50% | 83.5 | BaFormer |
| Action Localization | Breakfast | Acc | 76.6 | BaFormer |
| Action Localization | Breakfast | Average F1 | 72.4 | BaFormer |
| Action Localization | Breakfast | Edit | 77.3 | BaFormer |
| Action Localization | Breakfast | F1@10% | 79.2 | BaFormer |
| Action Localization | Breakfast | F1@25% | 74.9 | BaFormer |
| Action Localization | Breakfast | F1@50% | 63.2 | BaFormer |
| Action Segmentation | 50 Salads | Acc | 89.5 | BaFormer |
| Action Segmentation | 50 Salads | Edit | 84.2 | BaFormer |
| Action Segmentation | 50 Salads | F1@10% | 89.3 | BaFormer |
| Action Segmentation | 50 Salads | F1@25% | 88.4 | BaFormer |
| Action Segmentation | 50 Salads | F1@50% | 83.9 | BaFormer |
| Action Segmentation | GTEA | Acc | 83 | BaFormer |
| Action Segmentation | GTEA | Edit | 88.7 | BaFormer |
| Action Segmentation | GTEA | F1@10% | 92 | BaFormer |
| Action Segmentation | GTEA | F1@25% | 91.3 | BaFormer |
| Action Segmentation | GTEA | F1@50% | 83.5 | BaFormer |
| Action Segmentation | Breakfast | Acc | 76.6 | BaFormer |
| Action Segmentation | Breakfast | Average F1 | 72.4 | BaFormer |
| Action Segmentation | Breakfast | Edit | 77.3 | BaFormer |
| Action Segmentation | Breakfast | F1@10% | 79.2 | BaFormer |
| Action Segmentation | Breakfast | F1@25% | 74.9 | BaFormer |
| Action Segmentation | Breakfast | F1@50% | 63.2 | BaFormer |
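
For readers unfamiliar with the metrics listed above, the following is a sketch of how frame-wise accuracy (Acc), the segmental edit score (Edit), and segmental F1 at an IoU threshold (F1@10%, F1@25%, F1@50%) are commonly computed in the temporal action segmentation literature. The exact evaluation script behind the reported numbers is not given on this page, so the function names and details such as tie handling and the absence of any background-class filtering are assumptions.

```python
import numpy as np

def to_segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs; end is exclusive."""
    segments, start = [], 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segments.append((labels[start], start, t))
            start = t
    return segments

def frame_accuracy(pred, gt):
    """Percentage of frames whose predicted label matches the ground truth."""
    pred, gt = np.asarray(pred), np.asarray(gt)
    return 100.0 * float((pred == gt).mean())

def edit_score(pred, gt):
    """Normalized Levenshtein similarity between the two segment label sequences."""
    p = [s[0] for s in to_segments(pred)]
    g = [s[0] for s in to_segments(gt)]
    d = np.zeros((len(p) + 1, len(g) + 1))
    d[:, 0] = np.arange(len(p) + 1)
    d[0, :] = np.arange(len(g) + 1)
    for i in range(1, len(p) + 1):
        for j in range(1, len(g) + 1):
            cost = 0 if p[i - 1] == g[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + cost)
    return 100.0 * (1.0 - d[-1, -1] / max(len(p), len(g), 1))

def f1_at_k(pred, gt, k):
    """Segmental F1: a predicted segment is a true positive if its IoU with an
    unmatched ground-truth segment of the same label is at least k (e.g. 0.5)."""
    p_segs, g_segs = to_segments(pred), to_segments(gt)
    matched = [False] * len(g_segs)
    tp = fp = 0
    for label, ps, pe in p_segs:
        best_iou, best_j = 0.0, -1
        for j, (g_label, gs, ge) in enumerate(g_segs):
            if g_label != label:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            union = max(pe, ge) - min(ps, gs)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j >= 0 and best_iou >= k and not matched[best_j]:
            tp += 1
            matched[best_j] = True
        else:
            fp += 1
    fn = matched.count(False)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)

# Usage: the prediction switches class one frame too early.
gt = [0, 0, 0, 0, 1, 1, 1, 1]
pred = [0, 0, 0, 1, 1, 1, 1, 1]
print(frame_accuracy(pred, gt), edit_score(pred, gt), f1_at_k(pred, gt, 0.5))
```

Accuracy penalizes every misplaced frame, while Edit and F1@k operate on segments, so they are the metrics most sensitive to over-segmentation.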

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)