Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Improving Augmentation and Evaluation Schemes for Semantic Image Synthesis

Prateek Katiyar, Anna Khoreva

Date: 2020-11-25
Tasks: Benchmarking, Data Augmentation, Image Generation, Image-to-Image Translation

Abstract

Although data augmentation is a de facto technique for boosting the performance of deep neural networks, little attention has been paid to developing augmentation strategies for generative adversarial networks (GANs). To this end, we introduce a novel augmentation scheme designed specifically for GAN-based semantic image synthesis models. We propose to randomly warp object shapes in the semantic label maps used as input to the generator. The local shape discrepancies between the warped and non-warped label maps and images enable the GAN to better learn the structural and geometric details of the scene and thus to improve the quality of the generated images. While benchmarking the augmented GAN models against their vanilla counterparts, we discover that the quantitative metrics reported in previous semantic image synthesis studies are strongly biased towards specific semantic classes, because they are derived via an external pre-trained segmentation network. We therefore propose to improve the established semantic image synthesis evaluation scheme by separately analyzing the performance of generated images on the biased and unbiased classes for the given segmentation network. Finally, we show strong quantitative and qualitative improvements obtained with our augmentation scheme, on both class splits, using state-of-the-art semantic image synthesis models across three different datasets. On average across the COCO-Stuff, ADE20K and Cityscapes datasets, the augmented models outperform their vanilla counterparts by ~3 mIoU and ~10 FID points.
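
The abstract describes warping object shapes in the label map but does not give the warping function. As a rough illustration only, the sketch below applies a smooth random displacement field to an integer label map. The function name, the coarse-grid displacement scheme, and the `grid` and `max_shift` parameters are assumptions made for this example and are not taken from the paper. Nearest-neighbour sampling is used so that every output pixel remains a valid class ID.

```python
import numpy as np

def random_warp_label_map(labels, grid=4, max_shift=3.0, rng=None):
    """Warp an (H, W) integer label map with a smooth random displacement field.

    Hypothetical helper: grid (>= 2) is the number of control points per axis,
    max_shift is the largest displacement in pixels.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = labels.shape
    # Random pixel offsets at a coarse grid of control points (dy and dx).
    coarse = rng.uniform(-max_shift, max_shift, size=(2, grid, grid))

    # Bilinear upsampling of the coarse offsets to full resolution.
    ys = np.linspace(0, grid - 1, h)
    xs = np.linspace(0, grid - 1, w)
    y0 = np.floor(ys).astype(int).clip(0, grid - 2)
    x0 = np.floor(xs).astype(int).clip(0, grid - 2)
    fy = (ys - y0)[:, None]
    fx = (xs - x0)[None, :]

    def upsample(c):
        top = c[y0][:, x0] * (1 - fx) + c[y0][:, x0 + 1] * fx
        bot = c[y0 + 1][:, x0] * (1 - fx) + c[y0 + 1][:, x0 + 1] * fx
        return top * (1 - fy) + bot * fy

    dy, dx = upsample(coarse[0]), upsample(coarse[1])

    # Nearest-neighbour lookup: labels are categorical, so never interpolate them.
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.rint(yy + dy).astype(int).clip(0, h - 1)
    src_x = np.rint(xx + dx).astype(int).clip(0, w - 1)
    return labels[src_y, src_x]

# Toy label map: a square of class 1 on a background of class 0.
labels = np.zeros((32, 32), dtype=np.int64)
labels[8:24, 8:24] = 1
warped = random_warp_label_map(labels, rng=np.random.default_rng(0))
```

In this sketch only the label map is warped while the paired real image would be left unchanged, which is one reading of the "local shape discrepancies between the warped and non-warped label maps and images" mentioned in the abstract.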

Results

The same values are listed under three tasks: Image-to-Image Translation, Image Generation, and 1 Image, 2*2 Stitching. A dash marks a metric that is not listed for that dataset and model.

Dataset                       Model           Accuracy   FID    mIoU
COCO-Stuff Labels-to-Photos   CC-FPSE-AUG     71.5       19.1   42.1
COCO-Stuff Labels-to-Photos   Pix2PixHD-AUG   54.1       54.2   21.9
Cityscapes Labels-to-Photo    CC-FPSE-AUG     93.5       52.1   63.1
Cityscapes Labels-to-Photo    Pix2PixHD-AUG   92.7       72.7   58
ADE20K Labels-to-Photos       CC-FPSE-AUG     -          32.6   44
ADE20K Labels-to-Photos       Pix2PixHD-AUG   -          41.5   -
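
The mIoU figures above are averages of per-class IoU, and the abstract proposes reporting that average separately for biased and unbiased classes. The sketch below shows one way such a split could be computed from predicted and ground-truth segmentation labels. The function names are hypothetical, and the list of biased classes is simply passed in as an argument, because the abstract does not state how the split is derived.

```python
import numpy as np

def per_class_iou(pred, target, num_classes):
    """Per-class IoU from flat integer label arrays; NaN for classes that never occur."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    inter = np.diag(conf).astype(float)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    iou = np.full(num_classes, np.nan)
    present = union > 0
    iou[present] = inter[present] / union[present]
    return iou

def split_miou(iou, biased_classes):
    """Mean IoU over all classes and over the two class splits (assumed given)."""
    mask = np.zeros(iou.shape[0], dtype=bool)
    mask[list(biased_classes)] = True
    return {
        "all": float(np.nanmean(iou)),
        "biased": float(np.nanmean(iou[mask])),
        "unbiased": float(np.nanmean(iou[~mask])),
    }

# Toy example with three classes; class 0 is treated as "biased" for illustration.
target = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 0, 1, 0, 2, 1])
iou = per_class_iou(pred, target, num_classes=3)
scores = split_miou(iou, biased_classes=[0])
```

Reporting the two splits side by side makes it visible when an overall mIoU gain comes mostly from classes the segmentation network already favours.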