Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Photographic Image Synthesis with Cascaded Refinement Networks

Qifeng Chen, Vladlen Koltun

2017-07-28 | ICCV 2017 10 | Image Generation, Image-to-Image Translation

Abstract

We present an approach to synthesizing photographic images conditioned on semantic layouts. Given a semantic label map, our approach produces an image with photographic appearance that conforms to the input layout. The approach thus functions as a rendering engine that takes a two-dimensional semantic specification of the scene and produces a corresponding photographic image. Unlike recent and contemporaneous work, our approach does not rely on adversarial training. We show that photographic images can be synthesized from semantic layouts by a single feedforward network with appropriate structure, trained end-to-end with a direct regression objective. The presented approach scales seamlessly to high resolutions; we demonstrate this by synthesizing photographic images at 2-megapixel resolution, the full resolution of our training data. Extensive perceptual experiments on datasets of outdoor and indoor scenes demonstrate that images synthesized by the presented approach are considerably more realistic than alternative approaches. The results are shown in the supplementary video at https://youtu.be/0fhUJT21-bs

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image-to-Image Translation | COCO-Stuff Labels-to-Photos | FID | 70.4 | CRN |
| Image-to-Image Translation | COCO-Stuff Labels-to-Photos | mIoU | 23.7 | CRN |
| Image-to-Image Translation | Cityscapes Labels-to-Photo | FID | 104.7 | CRN |
| Image-to-Image Translation | Cityscapes Labels-to-Photo | mIoU | 52.4 | CRN |
| Image-to-Image Translation | ADE20K Labels-to-Photos | FID | 73.3 | CRN |
| Image-to-Image Translation | ADE20K Labels-to-Photos | mIoU | 22.4 | CRN |
| Image-to-Image Translation | ADE20K-Outdoor Labels-to-Photos | FID | 99 | CRN |
| Image-to-Image Translation | ADE20K-Outdoor Labels-to-Photos | mIoU | 16.5 | CRN |
| Image Generation | COCO-Stuff Labels-to-Photos | FID | 70.4 | CRN |
| Image Generation | COCO-Stuff Labels-to-Photos | mIoU | 23.7 | CRN |
| Image Generation | Cityscapes Labels-to-Photo | FID | 104.7 | CRN |
| Image Generation | Cityscapes Labels-to-Photo | mIoU | 52.4 | CRN |
| Image Generation | ADE20K Labels-to-Photos | FID | 73.3 | CRN |
| Image Generation | ADE20K Labels-to-Photos | mIoU | 22.4 | CRN |
| Image Generation | ADE20K-Outdoor Labels-to-Photos | FID | 99 | CRN |
| Image Generation | ADE20K-Outdoor Labels-to-Photos | mIoU | 16.5 | CRN |
| 1 Image, 2*2 Stitching | COCO-Stuff Labels-to-Photos | FID | 70.4 | CRN |
| 1 Image, 2*2 Stitching | COCO-Stuff Labels-to-Photos | mIoU | 23.7 | CRN |
| 1 Image, 2*2 Stitching | Cityscapes Labels-to-Photo | FID | 104.7 | CRN |
| 1 Image, 2*2 Stitching | Cityscapes Labels-to-Photo | mIoU | 52.4 | CRN |
| 1 Image, 2*2 Stitching | ADE20K Labels-to-Photos | FID | 73.3 | CRN |
| 1 Image, 2*2 Stitching | ADE20K Labels-to-Photos | mIoU | 22.4 | CRN |
| 1 Image, 2*2 Stitching | ADE20K-Outdoor Labels-to-Photos | FID | 99 | CRN |
| 1 Image, 2*2 Stitching | ADE20K-Outdoor Labels-to-Photos | mIoU | 16.5 | CRN |
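
The mIoU figures above are not defined on this page. In labels-to-photo benchmarks, mIoU is typically obtained by running a pretrained segmentation network on the synthesized image and comparing its predicted labels with the input layout; the values listed here appear to be percentages. The sketch below shows only the final scoring step on two flat label sequences. The function name `mean_iou` and its arguments are illustrative assumptions, not part of any benchmark toolkit, and the exact protocol behind the reported numbers (segmentation model, ignored classes, averaging over the dataset) is not stated here.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes that occur in pred or target.

    Hypothetical helper for illustration: `pred` and `target` are flat
    sequences of integer class ids, one entry per pixel.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    intersection = [0] * num_classes  # pixels where both agree on class c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both maps
            ious.append(intersection[c] / union)

    if not ious:
        raise ValueError("no class is present in either input")
    return sum(ious) / len(ious)


# Toy 2x2 example, flattened row by row.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
print(round(100 * score, 1))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Classes absent from both maps are skipped here rather than counted as zero; benchmarks differ on that choice, and it can shift the reported number.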

Related Papers

fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection (2025-07-17)
FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization (2025-07-17)
A Distributed Generative AI Approach for Heterogeneous Multi-Domain Environments under Data Sharing constraints (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
FADE: Adversarial Concept Erasure in Flow Models (2025-07-16)
CharaConsist: Fine-Grained Consistent Character Generation (2025-07-15)
CATVis: Context-Aware Thought Visualization (2025-07-15)