Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

Dewei Zhou, You Li, Fan Ma, Xiaoting Zhang, Yi Yang

2024-02-08 · CVPR 2024 · Attribute · Conditional Text-to-Image Synthesis · Image Generation
Paper · PDF · Code (official)

Abstract

We present the Multi-Instance Generation (MIG) task: simultaneously generating multiple instances with diverse controls in one image. Given a set of predefined coordinates and their corresponding descriptions, the task is to ensure that generated instances appear accurately at the designated locations and that every instance's attributes adhere to its corresponding description. This broadens the scope of current research on single-instance generation, elevating it to a more versatile and practical dimension. Inspired by the idea of divide and conquer, we introduce an innovative approach named Multi-Instance Generation Controller (MIGC) to address the challenges of the MIG task. First, we break the MIG task down into several subtasks, each involving the shading of a single instance. To ensure precise shading for each instance, we introduce an instance-enhancement attention mechanism. Finally, we aggregate all the shaded instances to provide the information needed to accurately generate multiple instances in Stable Diffusion (SD). To evaluate how well generation models perform on the MIG task, we provide the COCO-MIG benchmark along with an evaluation pipeline. Extensive experiments were conducted on the proposed COCO-MIG benchmark as well as on various commonly used benchmarks. The results illustrate our model's exceptional control over quantity, position, attribute, and interaction. Code and demos will be released at https://migcproject.github.io/.
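The COCO-MIG evaluation hinges on checking whether each generated instance lands in its target box and carries the right attributes. The paper's exact pipeline is not reproduced here, but a minimal sketch of the box-level side, assuming axis-aligned (x1, y1, x2, y2) boxes, a hypothetical IoU threshold of 0.5, and per-instance attribute checks supplied externally (e.g. by a color classifier), could look like:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def evaluate_instances(targets, detections, attr_ok, iou_threshold=0.5):
    """Per-image instance success rate and mean IoU (hypothetical matching).

    targets:    list of target boxes, one per requested instance
    detections: list of detected boxes, assumed already matched in order
    attr_ok:    list of bools -- were the instance's attributes rendered correctly?
    """
    ious = [iou(t, d) for t, d in zip(targets, detections)]
    # An instance succeeds only if it is both well-placed and well-attributed.
    successes = [v >= iou_threshold and ok for v, ok in zip(ious, attr_ok)]
    n = len(targets)
    return sum(successes) / n, sum(ious) / n
```

Under this reading, the success rate counts instances that are both positioned and attributed correctly, while mIoU averages the raw box overlaps regardless of attributes.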

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Generation | COCO-MIG | instance success rate | 0.66 | MIGC |
| Image Generation | COCO-MIG | mIoU | 0.56 | MIGC |
| Text-to-Image Generation | COCO-MIG | instance success rate | 0.66 | MIGC |
| Text-to-Image Generation | COCO-MIG | mIoU | 0.56 | MIGC |
| 10-shot image generation | COCO-MIG | instance success rate | 0.66 | MIGC |
| 10-shot image generation | COCO-MIG | mIoU | 0.56 | MIGC |
| 1 Image, 2*2 Stitching | COCO-MIG | instance success rate | 0.66 | MIGC |
| 1 Image, 2*2 Stitching | COCO-MIG | mIoU | 0.56 | MIGC |

Related Papers

- fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
- Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection (2025-07-17)
- FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization (2025-07-17)
- A Distributed Generative AI Approach for Heterogeneous Multi-Domain Environments under Data Sharing constraints (2025-07-17)
- Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
- MGFFD-VLM: Multi-Granularity Prompt Learning for Face Forgery Detection with VLM (2025-07-16)
- Non-Adaptive Adversarial Face Generation (2025-07-16)
- FADE: Adversarial Concept Erasure in Flow Models (2025-07-16)