Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

PolyFormer: Referring Image Segmentation as Sequential Polygon Generation

Jiang Liu, Hui Ding, Zhaowei Cai, Yuting Zhang, Ravi Kumar Satzoda, Vijay Mahadevan, R. Manmatha

Published: 2023-02-14 (CVPR 2023)
Tasks: Quantization, Referring Expression Comprehension, Referring Expression Segmentation, Segmentation, Semantic Segmentation, Video Segmentation, Video Semantic Segmentation, Image Segmentation

Abstract

In this work, instead of directly predicting the pixel-level segmentation masks, the problem of referring image segmentation is formulated as sequential polygon generation, and the predicted polygons can be later converted into segmentation masks. This is enabled by a new sequence-to-sequence framework, Polygon Transformer (PolyFormer), which takes a sequence of image patches and text query tokens as input, and outputs a sequence of polygon vertices autoregressively. For more accurate geometric localization, we propose a regression-based decoder, which predicts the precise floating-point coordinates directly, without any coordinate quantization error. In the experiments, PolyFormer outperforms the prior art by a clear margin, e.g., 5.40% and 4.52% absolute improvements on the challenging RefCOCO+ and RefCOCOg datasets. It also shows strong generalization ability when evaluated on the referring video segmentation task without fine-tuning, e.g., achieving competitive 61.5% J&F on the Ref-DAVIS17 dataset.
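
Two ideas in the abstract can be illustrated with a short sketch: converting a predicted polygon into a segmentation mask, and the rounding error that coordinate quantization introduces. The code below is a minimal illustration under stated assumptions, not the authors' implementation. The function names `polygon_to_mask` and `quantize`, the pixel-center inclusion convention, the even-odd fill rule, and the bin-center quantization scheme are all assumptions made for this example.

```python
import numpy as np


def polygon_to_mask(vertices, height, width):
    """Rasterize one polygon into a boolean mask using the even-odd rule.

    vertices: sequence of (x, y) floating-point pixel coordinates.
    Assumption: a pixel counts as inside when its center
    (col + 0.5, row + 0.5) lies inside the polygon.
    """
    verts = np.asarray(vertices, dtype=np.float64)
    rows, cols = np.mgrid[0:height, 0:width]
    px = cols + 0.5  # x coordinate of each pixel center
    py = rows + 0.5  # y coordinate of each pixel center
    inside = np.zeros((height, width), dtype=bool)
    n = len(verts)
    for i in range(n):
        x0, y0 = verts[i]
        x1, y1 = verts[(i + 1) % n]
        if y0 == y1:
            continue  # a horizontal edge never crosses a horizontal ray
        # The edge straddles the horizontal line through the pixel center.
        straddles = (y0 > py) != (y1 > py)
        # x coordinate at which the edge meets that horizontal line.
        x_cross = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
        # Each crossing to the right of the pixel center flips inside/outside.
        inside ^= straddles & (px < x_cross)
    return inside


def quantize(coords, num_bins):
    """Snap normalized coordinates in [0, 1] to the centers of equal bins.

    Hypothetical stand-in for a classification-style coordinate decoder.
    """
    c = np.asarray(coords, dtype=np.float64)
    idx = np.minimum((c * num_bins).astype(int), num_bins - 1)
    return (idx + 0.5) / num_bins


# A 3x3 square with floating-point corners on a 6x6 canvas.
square = [(1.0, 1.0), (4.0, 1.0), (4.0, 4.0), (1.0, 4.0)]
mask = polygon_to_mask(square, height=6, width=6)

# Worst-case rounding error of a 64-bin quantizer on normalized coordinates.
samples = np.linspace(0.0, 1.0, 1001)
max_error = np.abs(quantize(samples, 64) - samples).max()
```

With this quantizer the error per coordinate is bounded by half a bin width, `0.5 / num_bins` in normalized units, which is the kind of error a decoder that regresses floating-point coordinates directly does not incur.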

Results

Task | Dataset | Metric | Value | Model
Instance Segmentation | RefCoCo val | Mean IoU | 76.94 | PolyFormer-L
Instance Segmentation | RefCoCo val | Overall IoU | 75.96 | PolyFormer-L
Instance Segmentation | RefCoCo val | Overall IoU | 74.82 | PolyFormer-B
Instance Segmentation | RefCOCOg-test | Mean IoU | 71.17 | PolyFormer-L
Instance Segmentation | RefCOCOg-test | Overall IoU | 70.19 | PolyFormer-L
Instance Segmentation | RefCOCOg-test | Mean IoU | 69.88 | PolyFormer-B
Instance Segmentation | RefCOCOg-test | Overall IoU | 69.05 | PolyFormer-B
Instance Segmentation | RefCOCO+ val | Mean IoU | 72.15 | PolyFormer-L
Instance Segmentation | RefCOCO+ val | Overall IoU | 69.33 | PolyFormer-L
Instance Segmentation | RefCOCO+ val | Mean IoU | 70.65 | PolyFormer-B
Instance Segmentation | RefCOCO+ val | Overall IoU | 67.64 | PolyFormer-B
Instance Segmentation | RefCOCO+ test B | Mean IoU | 66.73 | PolyFormer-L
Instance Segmentation | RefCOCO+ test B | Overall IoU | 61.87 | PolyFormer-L
Instance Segmentation | RefCOCO+ test B | Mean IoU | 64.64 | PolyFormer-B
Instance Segmentation | RefCOCO+ test B | Overall IoU | 59.33 | PolyFormer-B
Instance Segmentation | DAVIS 2017 (val) | J&F 1st frame | 60.9 | PolyFormer-B
Instance Segmentation | RefCOCO+ testA | Mean IoU | 75.71 | PolyFormer-L
Instance Segmentation | RefCOCO+ testA | Overall IoU | 74.56 | PolyFormer-L
Instance Segmentation | RefCOCO+ testA | Mean IoU | 74.51 | PolyFormer-B
Instance Segmentation | RefCOCO+ testA | Overall IoU | 72.89 | PolyFormer-B
Instance Segmentation | ReferIt | Mean IoU | 67.22 | PolyFormer-L
Instance Segmentation | ReferIt | Overall IoU | 72.6 | PolyFormer-L
Instance Segmentation | ReferIt | Mean IoU | 65.98 | PolyFormer-B
Instance Segmentation | ReferIt | Overall IoU | 71.91 | PolyFormer-B
Instance Segmentation | RefCOCOg-val | Mean IoU | 71.15 | PolyFormer-L
Instance Segmentation | RefCOCOg-val | Overall IoU | 69.2 | PolyFormer-L
Instance Segmentation | RefCOCOg-val | Mean IoU | 69.36 | PolyFormer-B
Instance Segmentation | RefCOCOg-val | Overall IoU | 67.76 | PolyFormer-B
Referring Expression Segmentation | RefCoCo val | Mean IoU | 76.94 | PolyFormer-L
Referring Expression Segmentation | RefCoCo val | Overall IoU | 75.96 | PolyFormer-L
Referring Expression Segmentation | RefCoCo val | Overall IoU | 74.82 | PolyFormer-B
Referring Expression Segmentation | RefCOCOg-test | Mean IoU | 71.17 | PolyFormer-L
Referring Expression Segmentation | RefCOCOg-test | Overall IoU | 70.19 | PolyFormer-L
Referring Expression Segmentation | RefCOCOg-test | Mean IoU | 69.88 | PolyFormer-B
Referring Expression Segmentation | RefCOCOg-test | Overall IoU | 69.05 | PolyFormer-B
Referring Expression Segmentation | RefCOCO+ val | Mean IoU | 72.15 | PolyFormer-L
Referring Expression Segmentation | RefCOCO+ val | Overall IoU | 69.33 | PolyFormer-L
Referring Expression Segmentation | RefCOCO+ val | Mean IoU | 70.65 | PolyFormer-B
Referring Expression Segmentation | RefCOCO+ val | Overall IoU | 67.64 | PolyFormer-B
Referring Expression Segmentation | RefCOCO+ test B | Mean IoU | 66.73 | PolyFormer-L
Referring Expression Segmentation | RefCOCO+ test B | Overall IoU | 61.87 | PolyFormer-L
Referring Expression Segmentation | RefCOCO+ test B | Mean IoU | 64.64 | PolyFormer-B
Referring Expression Segmentation | RefCOCO+ test B | Overall IoU | 59.33 | PolyFormer-B
Referring Expression Segmentation | DAVIS 2017 (val) | J&F 1st frame | 60.9 | PolyFormer-B
Referring Expression Segmentation | RefCOCO+ testA | Mean IoU | 75.71 | PolyFormer-L
Referring Expression Segmentation | RefCOCO+ testA | Overall IoU | 74.56 | PolyFormer-L
Referring Expression Segmentation | RefCOCO+ testA | Mean IoU | 74.51 | PolyFormer-B
Referring Expression Segmentation | RefCOCO+ testA | Overall IoU | 72.89 | PolyFormer-B
Referring Expression Segmentation | ReferIt | Mean IoU | 67.22 | PolyFormer-L
Referring Expression Segmentation | ReferIt | Overall IoU | 72.6 | PolyFormer-L
Referring Expression Segmentation | ReferIt | Mean IoU | 65.98 | PolyFormer-B
Referring Expression Segmentation | ReferIt | Overall IoU | 71.91 | PolyFormer-B
Referring Expression Segmentation | RefCOCOg-val | Mean IoU | 71.15 | PolyFormer-L
Referring Expression Segmentation | RefCOCOg-val | Overall IoU | 69.2 | PolyFormer-L
Referring Expression Segmentation | RefCOCOg-val | Mean IoU | 69.36 | PolyFormer-B
Referring Expression Segmentation | RefCOCOg-val | Overall IoU | 67.76 | PolyFormer-B
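
The results report both Mean IoU and Overall IoU, which can differ noticeably for the same model. The sketch below shows the definitions commonly used in referring segmentation: Mean IoU averages the per-sample IoU, while Overall IoU divides the total intersection by the total union across all samples, so large objects weigh more. It assumes the leaderboard follows these conventions; the function name `mean_and_overall_iou` and the handling of empty masks (scored as 0) are choices made for this example only.

```python
import numpy as np


def mean_and_overall_iou(preds, gts):
    """Return (mean IoU, overall IoU) for paired boolean masks.

    preds, gts: sequences of boolean arrays with matching shapes.
    Assumption: a sample whose prediction and ground truth are both
    empty contributes an IoU of 0 to the mean.
    """
    inter = np.array(
        [np.logical_and(p, g).sum() for p, g in zip(preds, gts)],
        dtype=np.float64,
    )
    union = np.array(
        [np.logical_or(p, g).sum() for p, g in zip(preds, gts)],
        dtype=np.float64,
    )
    # Per-sample IoU, guarding against division by zero.
    per_sample = np.where(union > 0, inter / np.maximum(union, 1.0), 0.0)
    mean_iou = per_sample.mean()
    # Pooled IoU: accumulate pixels over the whole set before dividing.
    overall_iou = inter.sum() / union.sum() if union.sum() > 0 else 0.0
    return float(mean_iou), float(overall_iou)


# Sample 1: a small object, half-overlapping prediction (IoU = 2/6).
p1 = np.zeros((10, 10), dtype=bool)
g1 = np.zeros((10, 10), dtype=bool)
p1[0, 0:4] = True
g1[0, 2:6] = True

# Sample 2: a large object predicted perfectly (IoU = 100/100).
p2 = np.ones((10, 10), dtype=bool)
g2 = np.ones((10, 10), dtype=bool)

mean_iou, overall_iou = mean_and_overall_iou([p1, p2], [g1, g2])
```

In this toy case the mean IoU is (1/3 + 1) / 2 = 2/3, while the overall IoU is 102/106, because the perfectly segmented large object dominates the pooled pixel counts.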
