
Inversion-Free Image Editing with Natural Language

Sihan Xu, Yidong Huang, Jiayi Pan, Ziqiao Ma, Joyce Chai

2023-12-07 · Image Manipulation · Text-based Image Editing

Paper · PDF · Code (official)

Abstract

Despite recent advances in inversion-based editing, text-guided image manipulation remains challenging for diffusion models. The primary bottlenecks are: 1) the time-consuming nature of the inversion process; 2) the difficulty of balancing consistency with accuracy; 3) the lack of compatibility with the efficient consistency sampling used in consistency models. To address these issues, we start by asking whether the inversion process can be eliminated for editing. We show that when the initial sample is known, a special variance schedule reduces the denoising step to the same form as multi-step consistency sampling. We name this the Denoising Diffusion Consistent Model (DDCM), and note that it implies a virtual inversion strategy with no explicit inversion during sampling. We further unify the attention control mechanisms in a tuning-free framework for text-guided editing. Combining these, we present inversion-free editing (InfEdit), which allows consistent and faithful editing for both rigid and non-rigid semantic changes, catering to intricate modifications without compromising the image's integrity or requiring explicit inversion. Through extensive experiments, InfEdit shows strong performance across a variety of editing tasks while maintaining a seamless workflow (under 3 seconds on a single A40 GPU), demonstrating its potential for real-time applications. Project Page: https://sled-group.github.io/InfEdit/
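To make the DDCM observation concrete, here is a minimal numerical sketch, assuming the standard DDIM-family update; the names (`ddim_step`, `abar_t`, `x0_hat`) are ours, not the paper's. It shows how the special variance choice sigma_t = sqrt(1 - abar_prev) cancels the predicted-noise direction term, leaving exactly the multi-step consistency-sampling form the abstract describes.

```python
import numpy as np

def ddim_step(x_t, eps_pred, abar_t, abar_prev, sigma_t, noise):
    """Generic DDIM-family update (sketch):
    x_{t-1} = sqrt(abar_prev) * x0_hat
            + sqrt(1 - abar_prev - sigma_t**2) * eps_pred
            + sigma_t * noise
    """
    # Predicted clean sample recovered from the noise prediction.
    x0_hat = (x_t - np.sqrt(1.0 - abar_t) * eps_pred) / np.sqrt(abar_t)
    # Deterministic "direction" term; zero under the special schedule below.
    direction = np.sqrt(max(1.0 - abar_prev - sigma_t**2, 0.0)) * eps_pred
    return np.sqrt(abar_prev) * x0_hat + direction + sigma_t * noise

# Special variance schedule: sigma_t = sqrt(1 - abar_prev).
# Then 1 - abar_prev - sigma_t**2 = 0, the direction term vanishes, and
# the step collapses to
#   x_{t-1} = sqrt(abar_prev) * x0_hat + sqrt(1 - abar_prev) * noise,
# i.e. the multi-step consistency-sampling form. For editing, x0_hat can
# be anchored to the KNOWN source sample, so x_{t-1} is formed directly
# from it -- no explicit inversion of x_0 back to x_T is required.
```

The unified attention control builds on Prompt-to-Prompt-style cross-attention map injection. The following is a hedged sketch of that underlying mechanism only, not InfEdit's full unified framework: attention probabilities cached from the source branch are swapped into the target branch, so the edit inherits the source image's spatial layout while attending to the target prompt's values.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, k, v, probs_src=None, inject=False):
    """Cross-attention with optional source-map injection.

    q: (n_pix, d) image queries; k, v: (n_tok, d) text keys/values.
    probs_src: (n_pix, n_tok) attention maps cached from the source
    branch, with token positions aligned between the two prompts.
    """
    probs = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    if inject and probs_src is not None:
        probs = probs_src  # reuse source layout; values stay the target's
    return probs @ v, probs
```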

Results

Task: Image Generation · Dataset: PIE-Bench

| Model | Structure Distance ↓ | Background PSNR ↑ | Background LPIPS ↓ | CLIPSIM ↑ |
| --- | --- | --- | --- | --- |
| Virtual Inversion + Unified Attention Control + LCM | 13.78 | 28.51 | 47.58 | 25.03 |
| Virtual Inversion + Prompt-to-Prompt | 14.22 | 27.52 | 47.98 | 24.89 |
| Virtual Inversion + Prompt-to-Prompt + LCM | 15.61 | 26.64 | 55.85 | 24.57 |

(↑ higher is better; ↓ lower is better.)
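Structure Distance and Background LPIPS quantify deviation from the source image, Background PSNR quantifies how well the unedited background is preserved, and CLIPSIM quantifies alignment between the edited image and the target text. As an illustration of the background-preservation idea, below is a minimal sketch of a masked PSNR; the exact PIE-Bench protocol is defined in its own paper, and the function name, interface, and [0, 1] value range here are our assumptions.

```python
import numpy as np

def background_psnr(src, edited, edit_mask):
    """PSNR over background pixels only (outside the editing mask).

    src, edited: float arrays in [0, 1], shape (H, W, C).
    edit_mask:   bool array, True where edits are allowed, shape (H, W).
    """
    background = ~edit_mask
    mse = np.mean((src[background] - edited[background]) ** 2)
    if mse == 0.0:
        return float("inf")  # backgrounds are identical
    return 10.0 * np.log10(1.0 / mse)  # peak value 1.0 for [0, 1] images
```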

Related Papers

NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining (2025-07-18)
Beyond Fully Supervised Pixel Annotations: Scribble-Driven Weakly-Supervised Framework for Image Manipulation Localization (2025-07-17)
Towards Reliable Identification of Diffusion-based Image Manipulations (2025-06-05)
UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation (2025-06-03)
Weakly-supervised Localization of Manipulated Image Regions Using Multi-resolution Learned Features (2025-05-29)
Cora: Correspondence-aware image editing using few step diffusion (2025-05-29)
RBench-V: A Primary Assessment for Visual Reasoning Models with Multi-modal Outputs (2025-05-22)
My Face Is Mine, Not Yours: Facial Protection Against Diffusion Model Face Swapping (2025-05-21)