Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Temporal Memory Attention for Video Semantic Segmentation

Hao Wang, Weining Wang, Jing Liu

2021-02-17 | Segmentation, Semantic Segmentation, Video Semantic Segmentation

Abstract

Video semantic segmentation requires exploiting the complex temporal relations between the frames of a video sequence. Previous works usually rely on accurate optical flow to leverage these temporal relations, which incurs a heavy computational cost. In this paper, we propose a Temporal Memory Attention Network (TMANet) that adaptively integrates long-range temporal relations over the video sequence using the self-attention mechanism, without exhaustive optical flow prediction. Specifically, we construct a memory from several past frames to store the temporal information for the current frame. We then propose a temporal memory attention module that captures the relation between the current frame and the memory in order to enhance the representation of the current frame. Our method achieves new state-of-the-art performance on two challenging video semantic segmentation datasets: 80.3% mIoU on Cityscapes and 76.5% mIoU on CamVid with ResNet-50.
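The idea of reading from a memory of past frames can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the projections, normalization, or fusion used in TMANet, so the sketch assumes identity key/value projections, scaled dot-product attention, and a residual sum as the fusion step. The function name `temporal_memory_attention` and the list-of-vectors feature layout are hypothetical, chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def temporal_memory_attention(current, memory_frames):
    """Enhance current-frame features by attending over past-frame features.

    current:       list of N feature vectors (one per spatial position), each of length C
    memory_frames: list of T past frames, each a list of feature vectors of length C

    Assumptions (not taken from the paper): keys and values are the raw
    memory features, and the attention readout is added to the query.
    """
    # Flatten the T past frames into a single memory of key/value vectors.
    memory = [vec for frame in memory_frames for vec in frame]
    channels = len(current[0])
    scale = math.sqrt(channels)

    enhanced = []
    for query in current:
        # Similarity of this position to every memory position.
        weights = softmax([dot(query, key) / scale for key in memory])
        # Attention readout: weighted sum of the memory values.
        readout = [
            sum(w * value[c] for w, value in zip(weights, memory))
            for c in range(channels)
        ]
        # Residual fusion of the current feature and the readout (assumed).
        enhanced.append([q + r for q, r in zip(query, readout)])
    return enhanced


# Two spatial positions, two channels, and a memory built from two past frames.
current = [[1.0, 0.0], [0.0, 1.0]]
memory_frames = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.0, 1.0]],
]
enhanced = temporal_memory_attention(current, memory_frames)
```

Each current-frame position attends over every position in every stored frame, so no explicit frame-to-frame motion estimate such as optical flow is needed. The cost grows with the number of memory positions, which is one reason a real model would typically apply this to downsampled feature maps.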

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Scene Parsing | Cityscapes val | mIoU | 80.3 | TMANet-50 |
| Scene Parsing | CamVid | Mean IoU | 76.5 | TMANet-50 |
| Scene Parsing | CamVid | Mean IoU | 74.7 | Netwarp |
| Semantic Segmentation | UrbanLF | mIoU (Real) | 77.14 | TMANet |
| Semantic Segmentation | UrbanLF | mIoU (Syn) | 76.41 | TMANet |
| Video Semantic Segmentation | Cityscapes val | mIoU | 80.3 | TMANet-50 |
| Video Semantic Segmentation | CamVid | Mean IoU | 76.5 | TMANet-50 |
| Video Semantic Segmentation | CamVid | Mean IoU | 74.7 | Netwarp |
| Scene Understanding | Cityscapes val | mIoU | 80.3 | TMANet-50 |
| Scene Understanding | CamVid | Mean IoU | 76.5 | TMANet-50 |
| Scene Understanding | CamVid | Mean IoU | 74.7 | Netwarp |
| 2D Semantic Segmentation | Cityscapes val | mIoU | 80.3 | TMANet-50 |
| 2D Semantic Segmentation | CamVid | Mean IoU | 76.5 | TMANet-50 |
| 2D Semantic Segmentation | CamVid | Mean IoU | 74.7 | Netwarp |
| 10-shot image generation | UrbanLF | mIoU (Real) | 77.14 | TMANet |
| 10-shot image generation | UrbanLF | mIoU (Syn) | 76.41 | TMANet |
