Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code

Xuan Ju, Ailing Zeng, Yuxuan Bian, Shaoteng Liu, Qiang Xu

Published: 2023-10-02 · Tasks: Image Generation, Text-based Image Editing
Links: Paper · PDF · Code (official)

Abstract

Text-guided diffusion models have revolutionized image generation and editing, offering exceptional realism and diversity. Specifically, in the context of diffusion-based editing, where a source image is edited according to a target prompt, the process commences by acquiring a noisy latent vector corresponding to the source image via the diffusion model. This vector is subsequently fed into separate source and target diffusion branches for editing. The accuracy of this inversion process significantly impacts the final editing outcome, influencing both essential content preservation of the source image and edit fidelity according to the target prompt. Prior inversion techniques aimed at finding a unified solution in both the source and target diffusion branches. However, our theoretical and empirical analyses reveal that disentangling these branches leads to a distinct separation of responsibilities for preserving essential content and ensuring edit fidelity. Building on this insight, we introduce "Direct Inversion," a novel technique achieving optimal performance of both branches with just three lines of code. To assess image editing performance, we present PIE-Bench, an editing benchmark with 700 images showcasing diverse scenes and editing types, accompanied by versatile annotations and comprehensive evaluation metrics. Compared to state-of-the-art optimization-based inversion techniques, our solution not only yields superior performance across 8 editing methods but also achieves nearly an order-of-magnitude speed-up.
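The core idea in the abstract — keep the target branch free to follow the edit prompt while snapping the source branch back onto the recorded inversion trajectory — can be sketched with a toy stand-in for the diffusion model. Everything here (`denoise`, the 0.9/0.01 coefficients, the step count) is a hypothetical stand-in for illustration, not the authors' code; the real method operates on Stable Diffusion latents.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise(z, t):
    # Toy stand-in for one reverse-diffusion step; a real model would be
    # conditioned on a text prompt (source or target) here.
    return 0.9 * z + 0.01 * t

def invert(z0, steps):
    # Approximate inversion, as in practice: it drops the 0.01*t term,
    # so replaying `denoise` does NOT exactly recover the trajectory.
    traj = [z0]
    z = z0
    for _ in range(steps):
        z = z / 0.9
        traj.append(z)
    return traj

def edit(z0, steps):
    traj = invert(z0, steps)
    z_src = traj[-1]
    z_tgt = traj[-1]  # target branch would use the edit prompt
    for t in reversed(range(steps)):
        z_src = denoise(z_src, t)
        z_tgt = denoise(z_tgt, t)
        # The "three lines": put the source branch back onto the recorded
        # inversion trajectory, cancelling accumulated inversion error.
        delta = traj[t] - z_src
        z_src = z_src + delta
        # The target branch is left untouched, so it stays free to edit.
    return z_src, z_tgt

z0 = rng.standard_normal(4)
z_src, z_tgt = edit(z0, steps=10)
print(np.allclose(z_src, z0))  # True: source content is preserved exactly
```

The point of the sketch is the separation of responsibilities: the correction makes the source branch reproduce the source latent exactly regardless of inversion error, while the uncorrected target branch is unconstrained.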

Results

Dataset: PIE-Bench. The page lists identical numbers under four task labels (Image Generation, Text-to-Image Generation, 10-shot image generation, 1 Image, 2*2 Stitchi); they are shown once below.

Model                               Structure Distance   Background PSNR   Background LPIPS   CLIPSIM
Direct Inversion+Prompt-to-Prompt   11.65                27.22             54.55              25.02
Direct Inversion+MasaCtrl           24.7                 22.64             87.94              24.38
Direct Inversion+Plug-and-Play      24.29                22.46             106.06             25.41
Direct Inversion+Pix2Pix-Zero       49.22                21.53             138.98             23.31
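The "Background PSNR" column measures how faithfully the unedited background survives editing: PSNR computed only over pixels outside the edited region. A minimal sketch, assuming a binary foreground mask (the exact PIE-Bench protocol may differ):

```python
import numpy as np

def background_psnr(src, edited, mask, peak=255.0):
    """PSNR over pixels where mask == 0 (the background).

    The binary foreground `mask` is an assumption about how a
    background-restricted metric is computed, not the benchmark's
    exact protocol. PSNR = 10 * log10(peak^2 / MSE).
    """
    bg = mask == 0
    mse = np.mean((src[bg].astype(float) - edited[bg].astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # background untouched -> infinite PSNR
    return 10.0 * np.log10(peak ** 2 / mse)

src = np.full((8, 8), 100.0)
edited = src.copy()
edited[:4] += 10.0                 # pretend the top half was edited
mask = np.zeros((8, 8))
mask[:4] = 1                       # mark the top half as foreground
print(background_psnr(src, edited, mask))  # inf: background is identical
```

Higher Background PSNR (and lower Background LPIPS) means better content preservation; CLIPSIM instead measures agreement between the edited image and the target prompt, so the two metric groups track the two branches' separate responsibilities.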

Related Papers

NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining (2025-07-18)
fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection (2025-07-17)
FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization (2025-07-17)
A Distributed Generative AI Approach for Heterogeneous Multi-Domain Environments under Data Sharing constraints (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
FADE: Adversarial Concept Erasure in Flow Models (2025-07-16)
CharaConsist: Fine-Grained Consistent Character Generation (2025-07-15)