
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Temporally Distributed Networks for Fast Video Semantic Segmentation

Ping Hu, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Stan Sclaroff, Federico Perazzi

2020-04-03 | CVPR 2020 6 | Real-Time Semantic Segmentation, Segmentation, Semantic Segmentation, Video Semantic Segmentation, Knowledge Distillation

Abstract

We present TDNet, a temporally distributed network designed for fast and accurate video semantic segmentation. We observe that features extracted from a certain high-level layer of a deep CNN can be approximated by composing features extracted from several shallower sub-networks. Leveraging the inherent temporal continuity in videos, we distribute these sub-networks over sequential frames. Therefore, at each time step, we only need to perform a lightweight computation to extract a sub-features group from a single sub-network. The full features used for segmentation are then recomposed by application of a novel attention propagation module that compensates for geometry deformation between frames. A grouped knowledge distillation loss is also introduced to further improve the representation power at both full and sub-feature levels. Experiments on Cityscapes, CamVid, and NYUD-v2 demonstrate that our method achieves state-of-the-art accuracy with significantly faster speed and lower latency.
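The scheduling idea in the abstract (one lightweight sub-network per frame, with the full feature recomposed from the most recent sub-features) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `TemporallyDistributedExtractor`, the toy sub-networks, and in particular the recomposition step, which is plain concatenation standing in for the paper's attention propagation module. It is not the authors' implementation.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence

Feature = List[float]
SubNet = Callable[[Sequence[float]], Feature]


class TemporallyDistributedExtractor:
    """Hypothetical round-robin sketch: one sub-network runs per frame."""

    def __init__(self, subnets: Sequence[SubNet]) -> None:
        self.subnets = list(subnets)
        # Keep only the most recent m sub-features, where m = number of sub-networks.
        self.window: Deque[Feature] = deque(maxlen=len(self.subnets))
        self.t = 0  # index of the current time step

    def step(self, frame: Sequence[float]) -> Feature:
        # Only one lightweight sub-network is evaluated at each time step.
        subnet = self.subnets[self.t % len(self.subnets)]
        self.window.append(subnet(frame))
        self.t += 1
        # ASSUMPTION: concatenation replaces attention propagation, so this
        # sketch does not compensate for motion between frames.
        return [value for feature in self.window for value in feature]


# Toy sub-networks (illustrative only): each maps a frame to a 1-element feature.
extractor = TemporallyDistributedExtractor([
    lambda frame: [sum(frame)],
    lambda frame: [max(frame)],
])
first = extractor.step([1, 2, 3])   # sub-network 0 runs
second = extractor.step([4, 5, 7])  # sub-network 1 runs
third = extractor.step([1, 1, 1])   # sub-network 0 runs again
```

After the window fills, every step returns a full-width feature while paying for only one sub-network, which is the source of the per-frame latency saving the abstract describes.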

Results

Task | Dataset | Metric | Value | Model
Scene Parsing | Cityscapes val | mIoU | 79.9 | TDNet-50 [9]
Scene Parsing | CamVid | Mean IoU | 76.2 | TDNet-50
Semantic Segmentation | NYU Depth v2 | Mean IoU | 43.5 | TD2-PSP50
Semantic Segmentation | NYU Depth v2 | Mean IoU | 37.4 | TD4-PSP18
Semantic Segmentation | UrbanLF | mIoU (Real) | 76.48 | TDNet (ResNet-50)
Semantic Segmentation | UrbanLF | mIoU (Syn) | 74.71 | TDNet (ResNet-50)
Semantic Segmentation | Cityscapes test | Time (ms) | 21 | TD4-BISE18
Semantic Segmentation | CamVid | Time (ms) | 90 | TD2-PSP50
Semantic Segmentation | CamVid | mIoU | 76 | TD2-PSP50
Semantic Segmentation | CamVid | Time (ms) | 40 | TD4-PSP18
Semantic Segmentation | CamVid | mIoU | 72.6 | TD4-PSP18
Semantic Segmentation | NYU Depth v2 | Speed (ms/f) | 35 | TD2-PSP50
Semantic Segmentation | NYU Depth v2 | mIoU | 43.5 | TD2-PSP50
Semantic Segmentation | NYU Depth v2 | Speed (ms/f) | 19 | TD4-PSP18
Semantic Segmentation | NYU Depth v2 | mIoU | 37.4 | TD4-PSP18
Video Semantic Segmentation | Cityscapes val | mIoU | 79.9 | TDNet-50 [9]
Video Semantic Segmentation | CamVid | Mean IoU | 76.2 | TDNet-50
Scene Understanding | Cityscapes val | mIoU | 79.9 | TDNet-50 [9]
Scene Understanding | CamVid | Mean IoU | 76.2 | TDNet-50
2D Semantic Segmentation | Cityscapes val | mIoU | 79.9 | TDNet-50 [9]
2D Semantic Segmentation | CamVid | Mean IoU | 76.2 | TDNet-50
10-shot image generation | NYU Depth v2 | Mean IoU | 43.5 | TD2-PSP50
10-shot image generation | NYU Depth v2 | Mean IoU | 37.4 | TD4-PSP18
10-shot image generation | UrbanLF | mIoU (Real) | 76.48 | TDNet (ResNet-50)
10-shot image generation | UrbanLF | mIoU (Syn) | 74.71 | TDNet (ResNet-50)
10-shot image generation | Cityscapes test | Time (ms) | 21 | TD4-BISE18
10-shot image generation | CamVid | Time (ms) | 90 | TD2-PSP50
10-shot image generation | CamVid | mIoU | 76 | TD2-PSP50
10-shot image generation | CamVid | Time (ms) | 40 | TD4-PSP18
10-shot image generation | CamVid | mIoU | 72.6 | TD4-PSP18
10-shot image generation | NYU Depth v2 | Speed (ms/f) | 35 | TD2-PSP50
10-shot image generation | NYU Depth v2 | mIoU | 43.5 | TD2-PSP50
10-shot image generation | NYU Depth v2 | Speed (ms/f) | 19 | TD4-PSP18
10-shot image generation | NYU Depth v2 | mIoU | 37.4 | TD4-PSP18
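
Most accuracy figures in the results are mean intersection-over-union (mIoU) scores. As a reminder of what that metric measures, here is a minimal sketch over flattened label sequences. The function name `mean_iou` and the ignore label 255 are assumptions for illustration. Benchmark toolkits usually accumulate intersections and unions over the whole dataset before averaging, so this is not the evaluation code behind the numbers listed here.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_label: int = 255) -> float:
    """Mean of per-class IoU = TP / (TP + FP + FN), over classes that occur."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:  # ASSUMPTION: pixels with this label are skipped
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[t] += 1      # false negative for the true class
            if 0 <= p < num_classes:
                union[p] += 1  # false positive for the predicted class
    # Classes absent from both prediction and target are left out of the mean.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Class 0: IoU 1/2, class 1: IoU 2/3, so the mean is 7/12.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```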

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)