High-Fidelity Pluralistic Image Completion with Transformers

Ziyu Wan, Jingbo Zhang, Dongdong Chen, Jing Liao

2021-03-25 | ICCV 2021 10 | Vocal Bursts Intensity Prediction | Image Inpainting

Abstract

Image completion has made tremendous progress with convolutional neural networks (CNNs) because of their powerful texture modeling capacity. However, due to some inherent properties (e.g., local inductive prior, spatially invariant kernels), CNNs neither understand global structures well nor naturally support pluralistic completion. Recently, transformers have demonstrated their power in modeling long-term relationships and generating diverse results, but their computational complexity is quadratic in the input length, which hampers their application to high-resolution images. This paper brings the best of both worlds to pluralistic image completion: appearance prior reconstruction with a transformer and texture replenishment with a CNN. The transformer recovers pluralistic, coherent structures together with some coarse textures, while the CNN enhances the local texture details of the coarse priors, guided by the high-resolution masked images. The proposed method vastly outperforms state-of-the-art methods in three respects: 1) a large boost in image fidelity, even compared to deterministic completion methods; 2) better diversity and higher fidelity for pluralistic completion; 3) exceptional generalization to large masks and generic datasets such as ImageNet.
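The quadratic cost mentioned in the abstract can be made concrete with a little arithmetic. The sketch below is only an illustration, not the paper's implementation: it assumes one token per (optionally downsampled) pixel and a single attention head, and the function name `attention_matrix_entries` and the chosen resolutions are hypothetical.

```python
def attention_matrix_entries(height, width, downsample=1):
    """Return (tokens, pairwise_scores) for full self-attention over an image.

    Assumption for illustration: each pixel of the (optionally downsampled)
    image becomes one token, and a single attention head scores every
    token against every other token.
    """
    tokens = (height // downsample) * (width // downsample)
    return tokens, tokens * tokens


# Compare a few square resolutions: doubling the side length
# quadruples the token count and multiplies the score count by 16.
for side in (32, 64, 256):
    tokens, scores = attention_matrix_entries(side, side)
    print(f"{side}x{side}: {tokens} tokens, {scores} attention scores")
```

Under these assumptions, a 256x256 input downsampled by a factor of 8 has the same token count as a 32x32 input, which is the kind of saving that makes it plausible to run the transformer on a low-resolution prior and leave high-resolution texture to a CNN.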

Results

Task	Dataset	Metric	Value	Model
Image Generation	CelebA-HQ	FID	12.84	ICT
Image Generation	CelebA-HQ	P-IDS	0.13	ICT
Image Generation	CelebA-HQ	U-IDS	0.58	ICT
Image Inpainting	CelebA-HQ	FID	12.84	ICT
Image Inpainting	CelebA-HQ	P-IDS	0.13	ICT
Image Inpainting	CelebA-HQ	U-IDS	0.58	ICT

Related Papers

RePaintGS: Reference-Guided Gaussian Splatting for Realistic and View-Consistent 3D Scene Inpainting (2025-07-11)
MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting (2025-06-30)
3DeepRep: 3D Deep Low-rank Tensor Representation for Hyperspectral Image Inpainting (2025-06-20)
Geological Field Restoration through the Lens of Image Inpainting (2025-06-05)
DreamDance: Animating Character Art via Inpainting Stable Gaussian Worlds (2025-05-30)
Structure Disruption: Subverting Malicious Diffusion-Based Inpainting via Self-Attention Query Perturbation (2025-05-26)
Unsupervised Raindrop Removal from a Single Image using Conditional Diffusion Models (2025-05-13)
CaRaFFusion: Improving 2D Semantic Segmentation with Camera-Radar Point Cloud Fusion and Zero-Shot Image Inpainting (2025-05-06)