
Coarse to Fine Multi-Resolution Temporal Convolutional Network

Dipika Singhania, Rahul Rahaman, Angela Yao

2021-05-23 | Action Segmentation | Segmentation | Video Segmentation | Video Semantic Segmentation

Abstract

Temporal convolutional networks (TCNs) are a commonly used architecture for temporal video segmentation. TCNs, however, tend to suffer from over-segmentation errors and require additional refinement modules to ensure smoothness and temporal coherency. In this work, we propose a novel temporal encoder-decoder to tackle the problem of sequence fragmentation. In particular, the decoder follows a coarse-to-fine structure with an implicit ensemble of multiple temporal resolutions. The ensembling produces smoother segmentations that are more accurate and better calibrated, bypassing the need for additional refinement modules. In addition, we enhance our training with a multi-resolution feature-augmentation strategy to promote robustness to varying temporal resolutions. Finally, to support our architecture and encourage further sequence coherency, we propose an action loss that penalizes misclassifications at the video level. Experiments show that our stand-alone architecture, together with our novel feature-augmentation strategy and new loss, outperforms the state of the art on three temporal video segmentation benchmarks.
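The idea of ensembling predictions from several temporal resolutions can be illustrated with a small sketch. This is not the authors' implementation: the function names, the nearest-neighbour upsampling, the equal weights, and the toy probabilities are all assumptions made for illustration. It only shows how averaging a fine, flickering prediction with coarser ones can remove a spurious one-frame segment.

```python
from typing import List

Probs = List[List[float]]  # per-frame class probabilities, indexed [time][class]


def upsample_nearest(probs: Probs, target_len: int) -> Probs:
    """Stretch a coarse sequence to target_len frames by repeating entries.

    Nearest-neighbour repetition is an assumption of this sketch.
    """
    src_len = len(probs)
    return [probs[min(t * src_len // target_len, src_len - 1)]
            for t in range(target_len)]


def ensemble_resolutions(layers: Probs.__class__, weights: List[float]) -> Probs:
    """Weighted average of per-layer probabilities at the finest resolution.

    `layers` is a list of Probs, one per (hypothetical) decoder resolution.
    """
    target_len = max(len(p) for p in layers)
    num_classes = len(layers[0][0])
    total = sum(weights)
    upsampled = [upsample_nearest(p, target_len) for p in layers]
    return [
        [sum(w * up[t][c] for w, up in zip(weights, upsampled)) / total
         for c in range(num_classes)]
        for t in range(target_len)
    ]


def argmax_labels(probs: Probs) -> List[int]:
    """Most likely class index for every frame."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


# Toy data (made up): two classes at 8, 4 and 2 time steps.
fine = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1],
        [0.7, 0.3], [0.2, 0.8], [0.1, 0.9], [0.1, 0.9]]
mid = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
coarse = [[0.9, 0.1], [0.2, 0.8]]

fine_labels = argmax_labels(fine)
ensemble_labels = argmax_labels(
    ensemble_resolutions([fine, mid, coarse], [1.0, 1.0, 1.0]))
print(fine_labels)      # fragmented: class 1 flickers in at frame 2
print(ensemble_labels)  # one clean boundary between class 0 and class 1
```

In this toy case the fine-resolution labels contain an isolated class-1 frame, while the averaged prediction yields two contiguous segments. How the actual model weights and combines its decoder layers is described in the paper, not here.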

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Action Localization | 50 Salads | Acc | 84.9 | C2F-TCN |
| Action Localization | 50 Salads | Edit | 76.4 | C2F-TCN |
| Action Localization | 50 Salads | F1@10% | 84.3 | C2F-TCN |
| Action Localization | 50 Salads | F1@25% | 81.8 | C2F-TCN |
| Action Localization | 50 Salads | F1@50% | 72.6 | C2F-TCN |
| Action Localization | Assembly101 | Edit | 32.4 | C2F-TCN |
| Action Localization | Assembly101 | F1@10% | 33.3 | C2F-TCN |
| Action Localization | Assembly101 | F1@25% | 29 | C2F-TCN |
| Action Localization | Assembly101 | F1@50% | 21.3 | C2F-TCN |
| Action Localization | Assembly101 | MoF | 39.2 | C2F-TCN |
| Action Localization | GTEA | Acc | 80.8 | C2F-TCN |
| Action Localization | GTEA | Edit | 86.4 | C2F-TCN |
| Action Localization | GTEA | F1@10% | 90.3 | C2F-TCN |
| Action Localization | GTEA | F1@25% | 88.8 | C2F-TCN |
| Action Localization | GTEA | F1@50% | 77.7 | C2F-TCN |
| Action Localization | Breakfast | Acc | 76 | C2F-TCN |
| Action Localization | Breakfast | Average F1 | 66.2 | C2F-TCN |
| Action Localization | Breakfast | Edit | 69.6 | C2F-TCN |
| Action Localization | Breakfast | F1@10% | 72.2 | C2F-TCN |
| Action Localization | Breakfast | F1@25% | 68.7 | C2F-TCN |
| Action Localization | Breakfast | F1@50% | 57.6 | C2F-TCN |
| Action Segmentation | 50 Salads | Acc | 84.9 | C2F-TCN |
| Action Segmentation | 50 Salads | Edit | 76.4 | C2F-TCN |
| Action Segmentation | 50 Salads | F1@10% | 84.3 | C2F-TCN |
| Action Segmentation | 50 Salads | F1@25% | 81.8 | C2F-TCN |
| Action Segmentation | 50 Salads | F1@50% | 72.6 | C2F-TCN |
| Action Segmentation | Assembly101 | Edit | 32.4 | C2F-TCN |
| Action Segmentation | Assembly101 | F1@10% | 33.3 | C2F-TCN |
| Action Segmentation | Assembly101 | F1@25% | 29 | C2F-TCN |
| Action Segmentation | Assembly101 | F1@50% | 21.3 | C2F-TCN |
| Action Segmentation | Assembly101 | MoF | 39.2 | C2F-TCN |
| Action Segmentation | GTEA | Acc | 80.8 | C2F-TCN |
| Action Segmentation | GTEA | Edit | 86.4 | C2F-TCN |
| Action Segmentation | GTEA | F1@10% | 90.3 | C2F-TCN |
| Action Segmentation | GTEA | F1@25% | 88.8 | C2F-TCN |
| Action Segmentation | GTEA | F1@50% | 77.7 | C2F-TCN |
| Action Segmentation | Breakfast | Acc | 76 | C2F-TCN |
| Action Segmentation | Breakfast | Average F1 | 66.2 | C2F-TCN |
| Action Segmentation | Breakfast | Edit | 69.6 | C2F-TCN |
| Action Segmentation | Breakfast | F1@10% | 72.2 | C2F-TCN |
| Action Segmentation | Breakfast | F1@25% | 68.7 | C2F-TCN |
| Action Segmentation | Breakfast | F1@50% | 57.6 | C2F-TCN |
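For readers unfamiliar with the metrics listed here, the sketch below shows how frame-wise accuracy (Acc, also reported as MoF), the segmental Edit score, and the segmental F1 at an IoU threshold are commonly defined in the action segmentation literature. It is a minimal illustration under assumptions: the function names are hypothetical, no background class is excluded, and it is not confirmed that the reported values were computed with exactly this procedure.

```python
from typing import List, Tuple


def get_segments(labels: List[int]) -> List[Tuple[int, int, int]]:
    """Collapse frame labels into (label, start, end) runs; end is exclusive."""
    segments, start = [], 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segments.append((labels[start], start, t))
            start = t
    return segments


def frame_accuracy(pred: List[int], gt: List[int]) -> float:
    """Percentage of frames whose predicted label matches the ground truth."""
    correct = sum(p == g for p, g in zip(pred, gt))
    return 100.0 * correct / len(gt)


def edit_score(pred: List[int], gt: List[int]) -> float:
    """100 * (1 - normalised Levenshtein distance) between segment label sequences."""
    p = [s[0] for s in get_segments(pred)]
    g = [s[0] for s in get_segments(gt)]
    m, n = len(p), len(g)
    dist = list(range(n + 1))  # row 0 of the edit-distance table
    for i in range(1, m + 1):
        prev_diag, dist[0] = dist[0], i
        for j in range(1, n + 1):
            cur = min(dist[j] + 1,                        # deletion
                      dist[j - 1] + 1,                    # insertion
                      prev_diag + (p[i - 1] != g[j - 1]))  # substitution
            prev_diag, dist[j] = dist[j], cur
    return 100.0 * (1 - dist[n] / max(m, n))


def f1_at_k(pred: List[int], gt: List[int], iou_threshold: float) -> float:
    """Segmental F1: a predicted segment is a true positive if its best
    same-label ground-truth segment has IoU >= threshold and is still unmatched."""
    p_segs, g_segs = get_segments(pred), get_segments(gt)
    matched = [False] * len(g_segs)
    tp = fp = 0
    for label, ps, pe in p_segs:
        best_iou, best_idx = 0.0, -1
        for idx, (g_label, gs, ge) in enumerate(g_segs):
            if g_label != label:
                continue
            iou = (min(pe, ge) - max(ps, gs)) / (max(pe, ge) - min(ps, gs))
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx >= 0 and best_iou >= iou_threshold and not matched[best_idx]:
            tp, matched[best_idx] = tp + 1, True
        else:
            fp += 1
    fn = len(g_segs) - sum(matched)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy example (made up): one wrong frame splits a ground-truth segment.
gt = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
pred = [0, 0, 0, 0, 1, 0, 1, 1, 2, 2]
print(frame_accuracy(pred, gt))  # 90.0
print(edit_score(pred, gt))
print(f1_at_k(pred, gt, 0.5))
```

In the toy example a single misclassified frame costs only 10 points of accuracy, but it creates two extra segments, so the Edit score drops to about 60 and F1@50% to about 75. This is why the segmental metrics are the ones most sensitive to the over-segmentation errors the abstract describes.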
