Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation

Ming Xu, Stephen Gould

2024-04-01 | CVPR 2024 | Action Segmentation, Unsupervised Action Segmentation, Segmentation

Abstract

We propose a novel approach to the action segmentation task for long, untrimmed videos, based on solving an optimal transport problem. By encoding a temporal consistency prior into a Gromov-Wasserstein problem, we are able to decode a temporally consistent segmentation from a noisy affinity/matching cost matrix between video frames and action classes. Unlike previous approaches, our method does not require knowing the action order for a video to attain temporal consistency. Furthermore, our resulting (fused) Gromov-Wasserstein problem can be efficiently solved on GPUs using a few iterations of projected mirror descent. We demonstrate the effectiveness of our method in an unsupervised learning setting, where our method is used to generate pseudo-labels for self-training. We evaluate our segmentation approach and unsupervised learning pipeline on the Breakfast, 50-Salads, YouTube Instructions and Desktop Assembly datasets, yielding state-of-the-art results for the unsupervised video action segmentation task.
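To make the decoding idea in the abstract concrete, here is a minimal NumPy sketch of projected mirror descent on a fused Gromov-Wasserstein-style objective. It is not the authors' implementation. The function name `decode_segmentation`, the parameters `alpha`, `radius`, `step` and `n_iters`, and both structure matrices are assumptions made for illustration. The sketch keeps only a uniform per-frame constraint and leaves the action marginal entirely free, which is a cruder relaxation than the unbalanced formulation named in the title.

```python
import numpy as np

def decode_segmentation(cost, alpha=0.5, radius=3, step=1.0, n_iters=30):
    """Illustrative decoder: projected mirror descent on a fused GW-style objective.

    cost: (N, K) matching cost between N frames and K action classes.
    Returns (labels, T), where T is the soft frame-to-action assignment.
    """
    n_frames, n_actions = cost.shape

    # Assumed structure costs (hypothetical choices for this sketch):
    # frames within `radius` steps of each other count as temporally close,
    # and any two distinct actions count as different.
    idx = np.arange(n_frames)
    gap = np.abs(idx[:, None] - idx[None, :])
    c_frames = ((gap > 0) & (gap <= radius)) / (2.0 * radius)
    c_actions = 1.0 - np.eye(n_actions)

    # Each row sums to 1: a transport plan rescaled by N with a uniform
    # frame marginal. The action marginal is left unconstrained here.
    T = np.full((n_frames, n_actions), 1.0 / n_actions)

    for _ in range(n_iters):
        # Gradient of (1 - alpha) * <cost, T> + alpha * <T, c_frames @ T @ c_actions>.
        # Both structure matrices are symmetric, hence the factor 2.
        grad = (1.0 - alpha) * cost + 2.0 * alpha * (c_frames @ T @ c_actions)
        T = T * np.exp(-step * grad)           # multiplicative (mirror) step
        T = T / T.sum(axis=1, keepdims=True)   # KL projection onto the row constraint

    return T.argmax(axis=1), T


# Toy example: three actions of 20 frames each, with three isolated frames
# whose matching cost points at the wrong action.
truth = np.repeat([0, 1, 2], 20)
cost = 1.0 - np.eye(3)[truth]
for frame, wrong in [(7, 1), (30, 2), (52, 0)]:
    cost[frame] = 1.0 - np.eye(3)[wrong]

naive = cost.argmin(axis=1)            # per-frame decoding, no temporal prior
labels, T = decode_segmentation(cost)  # decoding with the temporal prior
```

In this toy setting the per-frame `argmin` reproduces the three corrupted frames as spurious one-frame segments, while the quadratic term penalizes neighboring frames that are assigned to different actions and pulls those frames back into the surrounding segment. Note that no action ordering is supplied anywhere; the only temporal information is the neighborhood structure in `c_frames`.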

Results

Task	Dataset	Metric	Value	Model
Action Localization	IKEA ASM	Accuracy	34	ASOT
Action Localization	IKEA ASM	F1	27.9	ASOT
Action Localization	IKEA ASM	JSD	88.7	ASOT
Action Localization	IKEA ASM	Precision	21.1	ASOT
Action Localization	IKEA ASM	Recall	24	ASOT
Action Localization	Youtube INRIA Instructional	Acc	52.9	ASOT
Action Localization	Youtube INRIA Instructional	F1	35.1	ASOT
Action Localization	Youtube INRIA Instructional	Precision	47.6	ASOT
Action Localization	Youtube INRIA Instructional	Recall	27.8	ASOT
Action Localization	Youtube INRIA Instructional	mIoU	24.7	ASOT
Action Localization	Breakfast	Acc	56.1	ASOT
Action Localization	Breakfast	F1	38.3	ASOT
Action Localization	Breakfast	JSD	94.9	ASOT
Action Localization	Breakfast	Precision	36.7	ASOT
Action Localization	Breakfast	Recall	40.1	ASOT
Action Localization	Breakfast	mIoU	18.6	ASOT
Action Segmentation	IKEA ASM	Accuracy	34	ASOT
Action Segmentation	IKEA ASM	F1	27.9	ASOT
Action Segmentation	IKEA ASM	JSD	88.7	ASOT
Action Segmentation	IKEA ASM	Precision	21.1	ASOT
Action Segmentation	IKEA ASM	Recall	24	ASOT
Action Segmentation	Youtube INRIA Instructional	Acc	52.9	ASOT
Action Segmentation	Youtube INRIA Instructional	F1	35.1	ASOT
Action Segmentation	Youtube INRIA Instructional	Precision	47.6	ASOT
Action Segmentation	Youtube INRIA Instructional	Recall	27.8	ASOT
Action Segmentation	Youtube INRIA Instructional	mIoU	24.7	ASOT
Action Segmentation	Breakfast	Acc	56.1	ASOT
Action Segmentation	Breakfast	F1	38.3	ASOT
Action Segmentation	Breakfast	JSD	94.9	ASOT
Action Segmentation	Breakfast	Precision	36.7	ASOT
Action Segmentation	Breakfast	Recall	40.1	ASOT
Action Segmentation	Breakfast	mIoU	18.6	ASOT
