Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Segmentation Transformer: Object-Contextual Representations for Semantic Segmentation

Yuhui Yuan, Xiaokang Chen, Xilin Chen, Jingdong Wang

2019-09-24 | ECCV 2020 8 | Segmentation | Semantic Segmentation

Abstract

In this paper, we address the semantic segmentation problem with a focus on the context aggregation strategy. Our motivation is that the label of a pixel is the category of the object that the pixel belongs to. We present a simple yet effective approach, object-contextual representations, characterizing a pixel by exploiting the representation of the corresponding object class. First, we learn object regions under the supervision of ground-truth segmentation. Second, we compute the object region representation by aggregating the representations of the pixels lying in the object region. Last, we compute the relation between each pixel and each object region and augment the representation of each pixel with the object-contextual representation, which is a weighted aggregation of all the object region representations according to their relations with the pixel. We empirically demonstrate that the proposed approach achieves competitive performance on various challenging semantic segmentation benchmarks: Cityscapes, ADE20K, LIP, PASCAL-Context, and COCO-Stuff. Our submission "HRNet + OCR + SegFix" achieves 1st place on the Cityscapes leaderboard by the time of submission. Code is available at: https://git.io/openseg and https://git.io/HRNet.OCR. We rephrase the object-contextual representation scheme using the Transformer encoder-decoder framework. The details are presented in Section 3.3.
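The three steps in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name, the flattened `(N, C)` feature layout, the use of plain dot products for the pixel-region relation, and the final concatenation are all assumptions made here, and the learned transformations a trained model would apply are omitted. The region logits are assumed to come from a coarse segmentation head trained with ground-truth supervision.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def object_contextual_representation(pixel_feats, region_logits):
    """Hypothetical sketch of the scheme described in the abstract.

    pixel_feats:   (N, C) array, one feature vector per pixel.
    region_logits: (N, K) array, per-pixel scores for K object classes
                   (assumed output of a supervised coarse segmentation head).
    Returns an (N, 2C) array: each pixel feature joined with its context.
    """
    # Step 1: soft object regions. Normalise each class's scores over
    # all pixels so every region is a weighting of the pixels.
    region_weights = softmax(region_logits, axis=0)            # (N, K)

    # Step 2: object region representations, aggregated from the
    # features of the pixels lying in each (soft) region.
    region_feats = region_weights.T @ pixel_feats              # (K, C)

    # Step 3: relation between each pixel and each region, then a
    # weighted aggregation of the region representations per pixel.
    relation = softmax(pixel_feats @ region_feats.T, axis=1)   # (N, K)
    context = relation @ region_feats                          # (N, C)

    # Augment each pixel with its object-contextual representation
    # (concatenation is an assumption of this sketch).
    return np.concatenate([pixel_feats, context], axis=1)

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))    # 6 pixels, 4 channels
logits = rng.normal(size=(6, 3))   # 3 object classes
augmented = object_contextual_representation(feats, logits)
print(augmented.shape)
```

Because both weightings are softmax-normalised, each context vector is a convex combination of region representations, which are in turn convex combinations of pixel features.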

Results

Task | Dataset | Metric | Value | Model
Semantic Segmentation | Cityscapes val | mIoU | 83.6 | HRNetV2 + OCR + RMI (PaddleClas pretrained)
Semantic Segmentation | Cityscapes val | mIoU | 80.6 | OCR (ResNet-101-FCN)
Semantic Segmentation | BDD100K val | mIoU | 60.1 | OCRNet
Semantic Segmentation | ADE20K val | mIoU | 47.98 | HRNetV2 + OCR + RMI (PaddleClas pretrained)
Semantic Segmentation | ADE20K val | mIoU | 45.66 | OCR (HRNetV2-W48)
Semantic Segmentation | ADE20K val | mIoU | 45.28 | OCR (ResNet-101)
Semantic Segmentation | PASCAL Context | mIoU | 59.6 | HRNetV2 + OCR + RMI (PaddleClas pretrained)
Semantic Segmentation | PASCAL Context | mIoU | 56.2 | OCR (HRNetV2-W48)
Semantic Segmentation | PASCAL Context | mIoU | 54.8 | OCR (ResNet-101)
Semantic Segmentation | ADE20K | Validation mIoU | 47.98 | HRNetV2 + OCR + RMI (PaddleClas pretrained)
Semantic Segmentation | ADE20K | Validation mIoU | 45.66 | OCR (HRNetV2-W48)
Semantic Segmentation | ADE20K | Validation mIoU | 45.28 | OCR (ResNet-101)
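
The results above are reported in mIoU (mean intersection over union). As a rough illustration of the metric, the sketch below computes it for a single pair of label maps. The function name and the choice to skip classes absent from both prediction and ground truth are assumptions of this sketch; benchmark evaluation scripts typically accumulate counts over the whole dataset and handle ignored labels, which is not shown here. The listed values appear to be percentages, while this function returns a fraction.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for one pair of integer label maps (illustrative)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            # Class appears in neither map: skip it (an assumption of this sketch).
            continue
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")

# Class 0: intersection 1, union 2. Class 1: intersection 2, union 3.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(round(100 * score, 2))
```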

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)