
MatteFormer: Transformer-Based Image Matting via Prior-Tokens

Gyutae Park, Sungjoon Son, Jaeyoung Yoo, SeHo Kim, Nojun Kwak

2022-03-29 | CVPR 2022 | Image Matting

Abstract

In this paper, we propose a transformer-based image matting model called MatteFormer, which takes full advantage of trimap information in the transformer block. Our method first introduces a prior-token, which is a global representation of each trimap region (e.g. foreground, background and unknown). These prior-tokens are used as global priors and participate in the self-attention mechanism of each block. Each stage of the encoder is composed of PAST (Prior-Attentive Swin Transformer) blocks, which are based on the Swin Transformer block but differ in two aspects: 1) Each has a PA-WSA (Prior-Attentive Window Self-Attention) layer, which performs self-attention not only with spatial-tokens but also with prior-tokens. 2) Each has a prior-memory, which accumulates prior-tokens from the previous blocks and transfers them to the next block. We evaluate MatteFormer on the commonly used image matting datasets Composition-1k and Distinctions-646. Experimental results show that the proposed method achieves state-of-the-art performance by a large margin. Our code is available at https://github.com/webtoon/matteformer.
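The idea of prior-tokens joining window self-attention can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes that each prior-token is the plain average of the features in its trimap region, and it uses single-head attention without learned projections, relative position bias, shifted windows, or prior-memory. The function names `prior_tokens` and `prior_attentive_attention` are hypothetical.

```python
import numpy as np

def prior_tokens(features, trimap_labels, num_regions=3):
    """Build one global token per trimap region (assumption: a plain mean).

    features:      (N, C) array of spatial-token features
    trimap_labels: (N,) integer array, one region id per token
                   (e.g. 0 = foreground, 1 = background, 2 = unknown)
    returns:       (num_regions, C) array of prior-tokens
    """
    tokens = np.zeros((num_regions, features.shape[1]), dtype=features.dtype)
    for region in range(num_regions):
        mask = trimap_labels == region
        if mask.any():  # leave an all-zero token for an empty region
            tokens[region] = features[mask].mean(axis=0)
    return tokens

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prior_attentive_attention(window_tokens, priors):
    """Attend from the window's spatial-tokens to spatial-tokens AND prior-tokens.

    Simplification: queries, keys and values are the raw features
    (no learned Q/K/V projections, one head).
    """
    keys = np.concatenate([window_tokens, priors], axis=0)    # (M + R, C)
    scale = window_tokens.shape[1] ** -0.5
    weights = softmax(window_tokens @ keys.T * scale, axis=-1)  # (M, M + R)
    return weights @ keys, weights

# Toy usage: 16 spatial-tokens with 8 channels, and a 4-token window.
rng = np.random.default_rng(0)
features = rng.normal(size=(16, 8))
labels = np.array([0] * 5 + [1] * 5 + [2] * 6)
priors = prior_tokens(features, labels)
out, weights = prior_attentive_attention(features[:4], priors)
```

In this toy setup each of the 4 window tokens attends over 7 keys: the 4 tokens in its own window plus the 3 prior-tokens, which is how global trimap context reaches a local window.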

Results

Task           Dataset         Metric  Value  Model
Image Matting  Composition-1K  Conn    18.9   MatteFormer
Image Matting  Composition-1K  Grad    8.7    MatteFormer
Image Matting  Composition-1K  MSE     4      MatteFormer
Image Matting  Composition-1K  SAD     23.8   MatteFormer
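
The page does not define the metrics or their units. As a rough sketch, the two simplest ones, SAD and MSE, are commonly computed over the unknown region of the trimap in matting evaluation; the version below assumes that convention, with alpha values in [0, 1] and SAD divided by 1000. Whether the figures above use exactly this scaling is an assumption. Grad and Conn are more involved and are omitted. The function names are hypothetical.

```python
import numpy as np

def sad(pred, gt, unknown):
    """Sum of absolute differences over the unknown region, in thousands."""
    return float(np.abs(pred - gt)[unknown].sum() / 1000.0)

def mse(pred, gt, unknown):
    """Mean squared error over the unknown region."""
    return float(((pred - gt) ** 2)[unknown].mean())

# Toy usage: a 4x4 alpha matte whose top two rows are the unknown region.
gt = np.zeros((4, 4))
pred = np.full((4, 4), 0.5)
unknown = np.zeros((4, 4), dtype=bool)
unknown[:2, :] = True

sad_value = sad(pred, gt, unknown)  # 8 pixels * 0.5 / 1000
mse_value = mse(pred, gt, unknown)  # 0.5 ** 2
```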

Related Papers

Post-Training Quantization for Video Matting (2025-06-12)
MP-Mat: A 3D-and-Instance-Aware Human Matting and Editing Framework with Multiplane Representation (2025-04-20)
MaSS13K: A Matting-level Semantic Segmentation Benchmark (2025-03-24)
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion (2025-03-11)
Path-Adaptive Matting for Efficient Inference Under Various Computational Cost Constraints (2025-03-05)
Object-Aware Video Matting with Cross-Frame Guidance (2025-03-03)
Enhancing Image Matting in Real-World Scenes with Mask-Guided Iterative Refinement (2025-02-24)
Efficient Portrait Matte Creation With Layer Diffusion and Connectivity Priors (2025-01-27)