TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Learning Canonical Representations for Scene Graph to Imag...

Learning Canonical Representations for Scene Graph to Image Generation

Roei Herzig, Amir Bar, Huijuan Xu, Gal Chechik, Trevor Darrell, Amir Globerson

2019-12-16ECCV 2020 8Scene GenerationLayout-to-Image GenerationImage Generation
PaperPDFCode(official)Code

Abstract

Generating realistic images of complex visual scenes becomes challenging when one wishes to control the structure of the generated images. Previous approaches showed that scenes with few entities can be controlled using scene graphs, but this approach struggles as the complexity of the graph (the number of objects and edges) increases. In this work, we show that one limitation of current methods is their inability to capture semantic equivalence in graphs. We present a novel model that addresses these issues by learning canonical graph representations from the data, resulting in improved image generation for complex visual scenes. Our model demonstrates improved empirical performance on large scene graphs, robustness to noise in the input scene graph, and generalization on semantically equivalent graphs. Finally, we show improved performance of the model on three different benchmarks: Visual Genome, COCO, and CLEVR.

Results

TaskDatasetMetricValueModel
Image GenerationCOCO-Stuff 256x256FID54.7AttSPADE
Image GenerationCOCO-Stuff 256x256Inception Score15.6AttSPADE
Image GenerationCOCO-Stuff 256x256LPIPS0.44AttSPADE
Image GenerationVisual Genome 256x256FID36.4AttSPADE
Image GenerationVisual Genome 256x256Inception Score11AttSPADE
Image GenerationVisual Genome 256x256LPIPS0.51AttSPADE

Related Papers

World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving2025-07-17fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting2025-07-17Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection2025-07-17FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization2025-07-17A Distributed Generative AI Approach for Heterogeneous Multi-Domain Environments under Data Sharing constraints2025-07-17Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images2025-07-17FADE: Adversarial Concept Erasure in Flow Models2025-07-16CharaConsist: Fine-Grained Consistent Character Generation2025-07-15