Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code

Xuan Ju, Ailing Zeng, Yuxuan Bian, Shaoteng Liu, Qiang Xu

Published: 2023-10-02 · Tasks: Image Generation, Text-based Image Editing
Links: Paper · PDF · Code (official)

Abstract

Text-guided diffusion models have revolutionized image generation and editing, offering exceptional realism and diversity. Specifically, in the context of diffusion-based editing, where a source image is edited according to a target prompt, the process commences by acquiring a noisy latent vector corresponding to the source image via the diffusion model. This vector is subsequently fed into separate source and target diffusion branches for editing. The accuracy of this inversion process significantly impacts the final editing outcome, influencing both essential content preservation of the source image and edit fidelity according to the target prompt. Prior inversion techniques aimed at finding a unified solution in both the source and target diffusion branches. However, our theoretical and empirical analyses reveal that disentangling these branches leads to a distinct separation of responsibilities for preserving essential content and ensuring edit fidelity. Building on this insight, we introduce "Direct Inversion," a novel technique achieving optimal performance of both branches with just three lines of code. To assess image editing performance, we present PIE-Bench, an editing benchmark with 700 images showcasing diverse scenes and editing types, accompanied by versatile annotations and comprehensive evaluation metrics. Compared to state-of-the-art optimization-based inversion techniques, our solution not only yields superior performance across 8 editing methods but also achieves nearly an order-of-magnitude speed-up.
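The core idea in the abstract — keep the target branch free to follow the edit prompt while snapping the source branch back onto the recorded inversion trajectory — can be sketched with a toy stand-in for the diffusion model. Everything here (`denoise`, the 0.9/0.01 coefficients, the step count) is a hypothetical stand-in for illustration, not the authors' code; the real method operates on Stable Diffusion latents.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise(z, t):
    # Toy stand-in for one reverse-diffusion step; a real model would be
    # conditioned on a text prompt (source or target) here.
    return 0.9 * z + 0.01 * t

def invert(z0, steps):
    # Approximate inversion, as in practice: it drops the 0.01*t term,
    # so replaying `denoise` does NOT exactly recover the trajectory.
    traj = [z0]
    z = z0
    for _ in range(steps):
        z = z / 0.9
        traj.append(z)
    return traj

def edit(z0, steps):
    traj = invert(z0, steps)
    z_src = traj[-1]
    z_tgt = traj[-1]  # target branch would use the edit prompt
    for t in reversed(range(steps)):
        z_src = denoise(z_src, t)
        z_tgt = denoise(z_tgt, t)
        # The "three lines": put the source branch back onto the recorded
        # inversion trajectory, cancelling accumulated inversion error.
        delta = traj[t] - z_src
        z_src = z_src + delta
        # The target branch is left untouched, so it stays free to edit.
    return z_src, z_tgt

z0 = rng.standard_normal(4)
z_src, z_tgt = edit(z0, steps=10)
print(np.allclose(z_src, z0))  # True: source content is preserved exactly
```

The point of the sketch is the separation of responsibilities: the correction makes the source branch reproduce the source latent exactly regardless of inversion error, while the uncorrected target branch is unconstrained.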

Results

Dataset: PIE-Bench. The page lists identical numbers under four task labels (Image Generation, Text-to-Image Generation, 10-shot image generation, 1 Image, 2*2 Stitchi); they are shown once below.

Model                               Structure Distance   Background PSNR   Background LPIPS   CLIPSIM
Direct Inversion+Prompt-to-Prompt   11.65                27.22             54.55              25.02
Direct Inversion+MasaCtrl           24.7                 22.64             87.94              24.38
Direct Inversion+Plug-and-Play      24.29                22.46             106.06             25.41
Direct Inversion+Pix2Pix-Zero       49.22                21.53             138.98             23.31
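The "Background PSNR" column measures how faithfully the unedited background survives editing: PSNR computed only over pixels outside the edited region. A minimal sketch, assuming a binary foreground mask (the exact PIE-Bench protocol may differ):

```python
import numpy as np

def background_psnr(src, edited, mask, peak=255.0):
    """PSNR over pixels where mask == 0 (the background).

    The binary foreground `mask` is an assumption about how a
    background-restricted metric is computed, not the benchmark's
    exact protocol. PSNR = 10 * log10(peak^2 / MSE).
    """
    bg = mask == 0
    mse = np.mean((src[bg].astype(float) - edited[bg].astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # background untouched -> infinite PSNR
    return 10.0 * np.log10(peak ** 2 / mse)

src = np.full((8, 8), 100.0)
edited = src.copy()
edited[:4] += 10.0                 # pretend the top half was edited
mask = np.zeros((8, 8))
mask[:4] = 1                       # mark the top half as foreground
print(background_psnr(src, edited, mask))  # inf: background is identical
```

Higher Background PSNR (and lower Background LPIPS) means better content preservation; CLIPSIM instead measures agreement between the edited image and the target prompt, so the two metric groups track the two branches' separate responsibilities.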

Related Papers

NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining (2025-07-18)
fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection (2025-07-17)
FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization (2025-07-17)
A Distributed Generative AI Approach for Heterogeneous Multi-Domain Environments under Data Sharing constraints (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
FADE: Adversarial Concept Erasure in Flow Models (2025-07-16)
CharaConsist: Fine-Grained Consistent Character Generation (2025-07-15)