Improving Augmentation and Evaluation Schemes for Semantic Image Synthesis

Prateek Katiyar, Anna Khoreva

2020-11-25

Benchmarking, Data Augmentation, Image Generation, Image-to-Image Translation

Abstract

Despite data augmentation being a de facto technique for boosting the performance of deep neural networks, little attention has been paid to developing augmentation strategies for generative adversarial networks (GANs). To this end, we introduce a novel augmentation scheme designed specifically for GAN-based semantic image synthesis models. We propose to randomly warp object shapes in the semantic label maps used as input to the generator. The local shape discrepancies between the warped and non-warped label maps and images enable the GAN to better learn the structural and geometric details of the scene and thus to improve the quality of generated images. While benchmarking the augmented GAN models against their vanilla counterparts, we discover that the quantification metrics reported in previous semantic image synthesis studies are strongly biased towards specific semantic classes, as they are derived via an external pre-trained segmentation network. We therefore propose to improve the established semantic image synthesis evaluation scheme by separately analyzing the performance of generated images on the biased and unbiased classes for the given segmentation network. Finally, we show strong quantitative and qualitative improvements obtained with our augmentation scheme, on both class splits, using state-of-the-art semantic image synthesis models across three different datasets. On average across the COCO-Stuff, ADE20K and Cityscapes datasets, the augmented models outperform their vanilla counterparts by ~3 mIoU and ~10 FID points.
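
The abstract does not specify how the label maps are warped, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a smooth random displacement field built from a coarse grid of random offsets, and nearest-neighbour sampling so that the warped map contains only existing class ids. The function name random_warp_label_map and the parameters grid_size and max_shift are hypothetical and chosen for illustration.

```python
import numpy as np


def random_warp_label_map(label_map, grid_size=4, max_shift=0.05, rng=None):
    """Warp a 2-D integer label map with a smooth random displacement field.

    grid_size (control points per axis, >= 2) and max_shift (fraction of the
    image size) are hypothetical knobs for this sketch, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = label_map.shape

    # Random offsets (in pixels) on a coarse control grid.
    coarse_dy = rng.uniform(-max_shift, max_shift, (grid_size, grid_size)) * h
    coarse_dx = rng.uniform(-max_shift, max_shift, (grid_size, grid_size)) * w

    # Position of every pixel row/column in control-grid coordinates.
    gy = np.linspace(0, grid_size - 1, h)
    gx = np.linspace(0, grid_size - 1, w)
    y0 = np.clip(np.floor(gy).astype(int), 0, grid_size - 2)
    x0 = np.clip(np.floor(gx).astype(int), 0, grid_size - 2)
    ty = (gy - y0)[:, None]  # vertical interpolation weights, shape (h, 1)
    tx = (gx - x0)[None, :]  # horizontal interpolation weights, shape (1, w)

    def upsample(coarse):
        # Bilinear interpolation of the coarse offsets to full resolution.
        top = coarse[y0][:, x0] * (1 - tx) + coarse[y0][:, x0 + 1] * tx
        bottom = coarse[y0 + 1][:, x0] * (1 - tx) + coarse[y0 + 1][:, x0 + 1] * tx
        return top * (1 - ty) + bottom * ty

    dy = upsample(coarse_dy)
    dx = upsample(coarse_dx)

    # Nearest-neighbour lookup keeps labels discrete (no blended class ids).
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(yy + dy), 0, h - 1).astype(int)
    src_x = np.clip(np.rint(xx + dx), 0, w - 1).astype(int)
    return label_map[src_y, src_x]


# Toy example: a square "object" of class 7 on background class 0.
labels = np.zeros((64, 64), dtype=np.int64)
labels[16:48, 16:48] = 7
warped = random_warp_label_map(labels, rng=np.random.default_rng(0))
```

Nearest-neighbour sampling is used here because interpolating class ids would produce meaningless intermediate labels. Setting max_shift to 0 returns the input unchanged, which gives a simple sanity check.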

Results

Task	Dataset	Metric	Value	Model
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	Accuracy	71.5	CC-FPSE-AUG
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	FID	19.1	CC-FPSE-AUG
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	mIoU	42.1	CC-FPSE-AUG
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	Accuracy	54.1	Pix2PixHD-AUG
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	FID	54.2	Pix2PixHD-AUG
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	mIoU	21.9	Pix2PixHD-AUG
Image-to-Image Translation	Cityscapes Labels-to-Photo	Accuracy	93.5	CC-FPSE-AUG
Image-to-Image Translation	Cityscapes Labels-to-Photo	FID	52.1	CC-FPSE-AUG
Image-to-Image Translation	Cityscapes Labels-to-Photo	mIoU	63.1	CC-FPSE-AUG
Image-to-Image Translation	Cityscapes Labels-to-Photo	Accuracy	92.7	Pix2PixHD-AUG
Image-to-Image Translation	Cityscapes Labels-to-Photo	FID	72.7	Pix2PixHD-AUG
Image-to-Image Translation	Cityscapes Labels-to-Photo	mIoU	58	Pix2PixHD-AUG
Image-to-Image Translation	ADE20K Labels-to-Photos	FID	32.6	CC-FPSE-AUG
Image-to-Image Translation	ADE20K Labels-to-Photos	mIoU	44	CC-FPSE-AUG
Image-to-Image Translation	ADE20K Labels-to-Photos	FID	41.5	Pix2PixHD-AUG
Image Generation	COCO-Stuff Labels-to-Photos	Accuracy	71.5	CC-FPSE-AUG
Image Generation	COCO-Stuff Labels-to-Photos	FID	19.1	CC-FPSE-AUG
Image Generation	COCO-Stuff Labels-to-Photos	mIoU	42.1	CC-FPSE-AUG
Image Generation	COCO-Stuff Labels-to-Photos	Accuracy	54.1	Pix2PixHD-AUG
Image Generation	COCO-Stuff Labels-to-Photos	FID	54.2	Pix2PixHD-AUG
Image Generation	COCO-Stuff Labels-to-Photos	mIoU	21.9	Pix2PixHD-AUG
Image Generation	Cityscapes Labels-to-Photo	Accuracy	93.5	CC-FPSE-AUG
Image Generation	Cityscapes Labels-to-Photo	FID	52.1	CC-FPSE-AUG
Image Generation	Cityscapes Labels-to-Photo	mIoU	63.1	CC-FPSE-AUG
Image Generation	Cityscapes Labels-to-Photo	Accuracy	92.7	Pix2PixHD-AUG
Image Generation	Cityscapes Labels-to-Photo	FID	72.7	Pix2PixHD-AUG
Image Generation	Cityscapes Labels-to-Photo	mIoU	58	Pix2PixHD-AUG
Image Generation	ADE20K Labels-to-Photos	FID	32.6	CC-FPSE-AUG
Image Generation	ADE20K Labels-to-Photos	mIoU	44	CC-FPSE-AUG
Image Generation	ADE20K Labels-to-Photos	FID	41.5	Pix2PixHD-AUG
