Fast Weakly Supervised Action Segmentation Using Mutual Consistency

Yaser Souri, Mohsen Fayyaz, Luca Minciullo, Gianpiero Francesca, Juergen Gall

2019-04-05 | Action Segmentation | Weakly Supervised Action Segmentation (Transcript) | Segmentation

Abstract

Action segmentation is the task of predicting the actions for each frame of a video. As obtaining the full annotation of videos for action segmentation is expensive, weakly supervised approaches that can learn only from transcripts are appealing. In this paper, we propose a novel end-to-end approach for weakly supervised action segmentation based on a two-branch neural network. The two branches of our network predict two redundant but different representations for action segmentation and we propose a novel mutual consistency (MuCon) loss that enforces the consistency of the two redundant representations. Using the MuCon loss together with a loss for transcript prediction, our proposed approach achieves the accuracy of state-of-the-art approaches while being $14$ times faster to train and $20$ times faster during inference. The MuCon loss proves beneficial even in the fully supervised setting.
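To make the idea of "two redundant but different representations" concrete, here is a minimal sketch. The abstract does not name the two representations, so this assumes one is a per-frame label sequence and the other is an ordered list of segments with lengths. The function names and the agreement measure are hypothetical illustrations. In particular, `frame_agreement` is a simple non-differentiable check and is not the MuCon loss.

```python
from itertools import groupby


def frames_to_segments(frame_labels):
    """Collapse per-frame labels into ordered (label, length) runs."""
    return [(label, len(list(run))) for label, run in groupby(frame_labels)]


def segments_to_frames(segments):
    """Expand ordered (label, length) runs back into per-frame labels."""
    return [label for label, length in segments for _ in range(length)]


def frame_agreement(frames_a, frames_b):
    """Fraction of frames on which two labelings agree.

    Illustrative only: an assumed stand-in for 'consistency',
    not the loss proposed in the paper.
    """
    if len(frames_a) != len(frames_b):
        raise ValueError("labelings must cover the same number of frames")
    if not frames_a:
        return 1.0
    matches = sum(a == b for a, b in zip(frames_a, frames_b))
    return matches / len(frames_a)


# Hypothetical 6-frame video.
frames = ["pour", "pour", "pour", "stir", "stir", "drink"]
segments = frames_to_segments(frames)

# A second, slightly different segment-level prediction of the same video.
other_segments = [("pour", 2), ("stir", 3), ("drink", 1)]
agreement = frame_agreement(frames, segments_to_frames(other_segments))
```

Both forms describe the same segmentation, so converting one into the other and comparing the results gives a signal that needs no frame-level annotation. Under the assumption above, that is the intuition behind enforcing consistency between the two branches.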

Results

Task	Dataset	Metric	Value	Model
Action Localization	Breakfast	Acc	62.8	MuCon
Action Localization	Breakfast	Average F1	62.6	MuCon
Action Localization	Breakfast	Edit	76.3	MuCon
Action Localization	Breakfast	F1@10%	73.2	MuCon
Action Localization	Breakfast	F1@25%	66.1	MuCon
Action Localization	Breakfast	F1@50%	48.4	MuCon
Action Localization	Breakfast	Acc	48.5	MuCon
Action Segmentation	Breakfast	Acc	62.8	MuCon
Action Segmentation	Breakfast	Average F1	62.6	MuCon
Action Segmentation	Breakfast	Edit	76.3	MuCon
Action Segmentation	Breakfast	F1@10%	73.2	MuCon
Action Segmentation	Breakfast	F1@25%	66.1	MuCon
Action Segmentation	Breakfast	F1@50%	48.4	MuCon
Action Segmentation	Breakfast	Acc	48.5	MuCon
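
The table does not state how the metrics were computed. As a rough sketch, assuming the definitions commonly used in action segmentation, "Acc" is the percentage of correctly labeled frames and "Edit" is a normalized Levenshtein distance between the predicted and ground-truth segment orderings. The function names are hypothetical, and the F1@k overlap metrics are omitted.

```python
from itertools import groupby


def frame_accuracy(pred, truth):
    """Percentage of frames whose predicted label matches the ground truth."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("need two non-empty labelings of equal length")
    return 100.0 * sum(p == t for p, t in zip(pred, truth)) / len(truth)


def transcript(frames):
    """Ordered list of segment labels, with consecutive repeats collapsed."""
    return [label for label, _ in groupby(frames)]


def edit_score(pred, truth):
    """100 * (1 - Levenshtein distance between transcripts / longer length)."""
    p, t = transcript(pred), transcript(truth)
    prev = list(range(len(t) + 1))  # distances from the empty prefix of p
    for i, a in enumerate(p, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # substitute or match
        prev = cur
    return 100.0 * (1 - prev[-1] / max(len(p), len(t), 1))


# Hypothetical labelings of a 6-frame video.
truth = ["pour", "pour", "stir", "stir", "drink", "drink"]
shifted = ["pour", "pour", "pour", "stir", "drink", "drink"]
fragmented = ["pour", "stir", "pour", "stir", "drink", "drink"]
```

In this sketch, `shifted` gets one frame wrong but keeps the segment order, so it loses accuracy while keeping a perfect edit score. `fragmented` adds spurious segments and is penalized on both.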
