Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Cross-View Image Synthesis using Conditional GANs

Krishna Regmi, Ali Borji

2018-03-09 · CVPR 2018

Tasks: Scene Generation · Segmentation · Semantic Segmentation · Image Generation · Cross-View Image-to-Image Translation · Image-to-Image Translation

Paper · PDF · Code (official)

Abstract

Learning to generate natural scenes has always been a challenging task in computer vision. It is even more painstaking when the generation is conditioned on images with drastically different views. This is mainly because understanding, corresponding, and transforming appearance and semantic information across the views is not trivial. In this paper, we attempt to solve the novel problem of cross-view image synthesis, aerial to street-view and vice versa, using conditional generative adversarial networks (cGAN). Two new architectures called Crossview Fork (X-Fork) and Crossview Sequential (X-Seq) are proposed to generate scenes with resolutions of 64x64 and 256x256 pixels. X-Fork architecture has a single discriminator and a single generator. The generator hallucinates both the image and its semantic segmentation in the target view. X-Seq architecture utilizes two cGANs. The first one generates the target image which is subsequently fed to the second cGAN for generating its corresponding semantic segmentation map. The feedback from the second cGAN helps the first cGAN generate sharper images. Both of our proposed architectures learn to generate natural images as well as their semantic segmentation maps. The proposed methods show that they are able to capture and maintain the true semantics of objects in source and target views better than the traditional image-to-image translation method which considers only the visual appearance of the scene. Extensive qualitative and quantitative evaluations support the effectiveness of our frameworks, compared to two state of the art methods, for natural scene generation across drastically different views.
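The abstract describes two data flows: X-Fork, where one generator forks into an image head and a segmentation head, and X-Seq, where a second cGAN consumes the first one's output. A minimal sketch of those two pipelines is below; the linear "networks" and all names (`make_net`, `x_fork`, `x_seq`) are hypothetical stand-ins for the paper's convolutional generators, shown only to make the wiring concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_net(in_dim, out_dim):
    """Stand-in for a cGAN generator: a fixed random linear map with tanh."""
    W = rng.standard_normal((in_dim, out_dim)) * 0.01
    return lambda x: np.tanh(x @ W)

IMG, SEG = 64 * 64 * 3, 64 * 64  # flattened image / segmentation sizes

# X-Fork: a single generator trunk forks into two heads, hallucinating
# both the target-view image and its semantic segmentation map.
trunk = make_net(IMG, 256)
img_head = make_net(256, IMG)
seg_head = make_net(256, SEG)

def x_fork(source_img):
    h = trunk(source_img)
    return img_head(h), seg_head(h)

# X-Seq: two cGANs in sequence; the first generates the target image,
# which is then fed to the second to produce its segmentation map.
g1 = make_net(IMG, IMG)
g2 = make_net(IMG, SEG)

def x_seq(source_img):
    target_img = g1(source_img)
    target_seg = g2(target_img)
    return target_img, target_seg

aerial = rng.standard_normal(IMG)
img_f, seg_f = x_fork(aerial)
img_s, seg_s = x_seq(aerial)
```

Note the structural difference: in X-Fork both outputs share one trunk, while in X-Seq the segmentation generator sees only the synthesized image, so its gradients feed back through the first generator during training.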

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image-to-Image Translation | Dayton (64×64) - ground-to-aerial | SSIM | 0.3682 | X-Fork |
| Image-to-Image Translation | Dayton (64×64) - ground-to-aerial | SSIM | 0.3663 | X-Seq |
| Image-to-Image Translation | CVUSA | SSIM | 0.4356 | X-Fork |
| Image-to-Image Translation | CVUSA | SSIM | 0.4231 | X-Seq |
| Image-to-Image Translation | Dayton (64×64) - aerial-to-ground | SSIM | 0.5171 | X-Seq |
| Image-to-Image Translation | Dayton (64×64) - aerial-to-ground | SSIM | 0.4921 | X-Fork |
| Image-to-Image Translation | Ego2Top | SSIM | 0.274 | X-Fork |
| Image-to-Image Translation | Ego2Top | SSIM | 0.2738 | X-Seq |
| Image-to-Image Translation | Dayton (256×256) - ground-to-aerial | SSIM | 0.2725 | X-Seq |
| Image-to-Image Translation | Dayton (256×256) - aerial-to-ground | SSIM | 0.5031 | X-Seq |
| Image-to-Image Translation | Dayton (256×256) - aerial-to-ground | SSIM | 0.4963 | X-Fork |
| Image Generation | Dayton (64×64) - ground-to-aerial | SSIM | 0.3682 | X-Fork |
| Image Generation | Dayton (64×64) - ground-to-aerial | SSIM | 0.3663 | X-Seq |
| Image Generation | CVUSA | SSIM | 0.4356 | X-Fork |
| Image Generation | CVUSA | SSIM | 0.4231 | X-Seq |
| Image Generation | Dayton (64×64) - aerial-to-ground | SSIM | 0.5171 | X-Seq |
| Image Generation | Dayton (64×64) - aerial-to-ground | SSIM | 0.4921 | X-Fork |
| Image Generation | Ego2Top | SSIM | 0.274 | X-Fork |
| Image Generation | Ego2Top | SSIM | 0.2738 | X-Seq |
| Image Generation | Dayton (256×256) - ground-to-aerial | SSIM | 0.2725 | X-Seq |
| Image Generation | Dayton (256×256) - aerial-to-ground | SSIM | 0.5031 | X-Seq |
| Image Generation | Dayton (256×256) - aerial-to-ground | SSIM | 0.4963 | X-Fork |
| 1 Image, 2×2 Stitching | Dayton (64×64) - ground-to-aerial | SSIM | 0.3682 | X-Fork |
| 1 Image, 2×2 Stitching | Dayton (64×64) - ground-to-aerial | SSIM | 0.3663 | X-Seq |
| 1 Image, 2×2 Stitching | CVUSA | SSIM | 0.4356 | X-Fork |
| 1 Image, 2×2 Stitching | CVUSA | SSIM | 0.4231 | X-Seq |
| 1 Image, 2×2 Stitching | Dayton (64×64) - aerial-to-ground | SSIM | 0.5171 | X-Seq |
| 1 Image, 2×2 Stitching | Dayton (64×64) - aerial-to-ground | SSIM | 0.4921 | X-Fork |
| 1 Image, 2×2 Stitching | Ego2Top | SSIM | 0.274 | X-Fork |
| 1 Image, 2×2 Stitching | Ego2Top | SSIM | 0.2738 | X-Seq |
| 1 Image, 2×2 Stitching | Dayton (256×256) - ground-to-aerial | SSIM | 0.2725 | X-Seq |
| 1 Image, 2×2 Stitching | Dayton (256×256) - aerial-to-ground | SSIM | 0.5031 | X-Seq |
| 1 Image, 2×2 Stitching | Dayton (256×256) - aerial-to-ground | SSIM | 0.4963 | X-Fork |
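All results above are reported in SSIM (structural similarity index), which compares two images through their luminance, contrast, and local structure. A simplified single-window sketch of the formula is shown below; the benchmark numbers use the standard windowed SSIM, so this global version only illustrates the computation, not the exact evaluation protocol.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Global (single-window) SSIM between two images in [0, data_range].

    SSIM = (2*mu_x*mu_y + C1)(2*cov_xy + C2) /
           ((mu_x^2 + mu_y^2 + C1)(var_x + var_y + C2))
    """
    C1 = (0.01 * data_range) ** 2  # conventional stabilizing constants
    C2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / (
        (mx ** 2 + my ** 2 + C1) * (vx + vy + C2)
    )

rng = np.random.default_rng(0)
a = rng.random((64, 64))                      # reference image
b = np.clip(a + 0.1 * rng.standard_normal(a.shape), 0, 1)  # noisy copy
```

Identical images score exactly 1.0; any degradation (here, additive noise) pulls the score below 1, which is why higher SSIM in the table indicates generated views closer to ground truth.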

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)