TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Pretraining is All You Need for Image-to-Image Translation

Pretraining is All You Need for Image-to-Image Translation

Tengfei Wang, Ting Zhang, Bo Zhang, Hao Ouyang, Dong Chen, Qifeng Chen, Fang Wen

2022-05-25Sketch-to-Image TranslationTranslationTexture SynthesisAllImage-to-Image Translation
PaperPDFCode(official)Code

Abstract

We propose to use pretraining to boost general image-to-image translation. Prior image-to-image translation methods usually need dedicated architectural design and train individual translation models from scratch, struggling for high-quality generation of complex scenes, especially when paired training data are not abundant. In this paper, we regard each image-to-image translation problem as a downstream task and introduce a simple and generic framework that adapts a pretrained diffusion model to accommodate various kinds of image-to-image translation. We also propose adversarial training to enhance the texture synthesis in the diffusion model training, in conjunction with normalized guidance sampling to improve the generation quality. We present extensive empirical comparison across various tasks on challenging benchmarks such as ADE20K, COCO-Stuff, and DIODE, showing the proposed pretraining-based image-to-image translation (PITI) is capable of synthesizing images of unprecedented realism and faithfulness.

Results

TaskDatasetMetricValueModel
Image-to-Image TranslationCOCO-Stuff Labels-to-PhotosFID15.6PITI
Image GenerationCOCO-Stuff Labels-to-PhotosFID15.6PITI
Sketch-to-Image TranslationCOCO-StuffFID18.5PITI
Sketch-to-Image TranslationCOCO-StuffFID-C7.6PITI
1 Image, 2*2 StitchingCOCO-Stuff Labels-to-PhotosFID15.6PITI

Related Papers

A Translation of Probabilistic Event Calculus into Markov Decision Processes2025-07-17Function-to-Style Guidance of LLMs for Code Translation2025-07-15Modeling Code: Is Text All You Need?2025-07-15All Eyes, no IMU: Learning Flight Attitude from Vision Alone2025-07-15Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation2025-07-09Pun Intended: Multi-Agent Translation of Wordplay with Contrastive Learning and Phonetic-Semantic Embeddings2025-07-09Unconditional Diffusion for Generative Sequential Recommendation2025-07-08Is Diversity All You Need for Scalable Robotic Manipulation?2025-07-08