Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Data Extrapolation for Text-to-image Generation on Small Datasets

Senmao Ye, Fei Liu

2024-10-02 | Text-to-Image Generation | Data Augmentation | Text to Image Generation | Image Generation

Abstract

Text-to-image generation requires a large amount of training data to synthesize high-quality images. To augment training data, previous methods rely on data interpolations such as cropping, flipping, and mixing up, which fail to introduce new information and yield only marginal improvements. In this paper, we propose a new data augmentation method for text-to-image generation based on linear extrapolation. Specifically, we apply linear extrapolation only to text features, and retrieve new image data from the internet using search engines. To ensure the reliability of the new text-image pairs, we design two outlier detectors that purify the retrieved images. Using extrapolation, we construct a training set dozens of times larger than the original dataset, which yields a significant improvement in text-to-image performance. Moreover, we propose a NULL-guidance to refine score estimation, and apply a recurrent affine transformation to fuse text information. Our model achieves FID scores of 7.91, 9.52 and 5.00 on the CUB, Oxford and COCO datasets. The code and data will be available on GitHub (https://github.com/senmaoy/RAT-Diffusion).
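
The contrast between interpolation and linear extrapolation drawn in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact formula, so the generic form used here, anchor + lam * (anchor - neighbor), is an assumption, as are the function names, the parameter `lam`, and the toy two-dimensional vectors. The sketch covers only the feature arithmetic; retrieving images for the new text features and filtering them with outlier detectors is not shown.

```python
from typing import List

Vector = List[float]


def extrapolate(anchor: Vector, neighbor: Vector, lam: float = 0.5) -> Vector:
    """Generic linear extrapolation: step away from `neighbor`, past `anchor`.

    Assumed form (not taken from the paper): anchor + lam * (anchor - neighbor).
    For lam > 0 the result lies outside the segment joining the two inputs.
    """
    if len(anchor) != len(neighbor):
        raise ValueError("feature vectors must have the same length")
    return [a + lam * (a - b) for a, b in zip(anchor, neighbor)]


def interpolate(anchor: Vector, neighbor: Vector, lam: float = 0.5) -> Vector:
    """Linear interpolation, for contrast: anchor + lam * (neighbor - anchor).

    For 0 <= lam <= 1 the result stays on the segment between the inputs.
    """
    if len(anchor) != len(neighbor):
        raise ValueError("feature vectors must have the same length")
    return [a + lam * (b - a) for a, b in zip(anchor, neighbor)]


if __name__ == "__main__":
    # Hypothetical toy "text features"; real features would come from a text encoder.
    f_a = [1.0, 2.0]
    f_b = [0.0, 0.0]
    print(extrapolate(f_a, f_b, 0.5))  # [1.5, 3.0], beyond f_a
    print(interpolate(f_a, f_b, 0.5))  # [0.5, 1.0], between f_a and f_b
```

In this toy case the interpolated point stays between the two inputs, while the extrapolated point moves past the anchor into a region neither input covers, which is the sense in which extrapolation can introduce new information.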

Results

Task                     | Dataset                          | Metric          | Value | Model
Image Generation         | COCO (Common Objects in Context) | FID             | 5     | RAT-Diffusion
Image Generation         | Oxford 102 Flowers               | FID             | 9.52  | RAT-Diffusion
Image Generation         | Oxford 102 Flowers               | Inception score | 4.35  | RAT-Diffusion
Image Generation         | CUB                              | FID             | 6.36  | RAT-Diffusion
Image Generation         | CUB                              | Inception score | 6.56  | RAT-Diffusion
Text-to-Image Generation | COCO (Common Objects in Context) | FID             | 5     | RAT-Diffusion
Text-to-Image Generation | Oxford 102 Flowers               | FID             | 9.52  | RAT-Diffusion
Text-to-Image Generation | Oxford 102 Flowers               | Inception score | 4.35  | RAT-Diffusion
Text-to-Image Generation | CUB                              | FID             | 6.36  | RAT-Diffusion
Text-to-Image Generation | CUB                              | Inception score | 6.56  | RAT-Diffusion
10-shot image generation | COCO (Common Objects in Context) | FID             | 5     | RAT-Diffusion
10-shot image generation | Oxford 102 Flowers               | FID             | 9.52  | RAT-Diffusion
10-shot image generation | Oxford 102 Flowers               | Inception score | 4.35  | RAT-Diffusion
10-shot image generation | CUB                              | FID             | 6.36  | RAT-Diffusion
10-shot image generation | CUB                              | Inception score | 6.56  | RAT-Diffusion
1 Image, 2*2 Stitchi     | COCO (Common Objects in Context) | FID             | 5     | RAT-Diffusion
1 Image, 2*2 Stitchi     | Oxford 102 Flowers               | FID             | 9.52  | RAT-Diffusion
1 Image, 2*2 Stitchi     | Oxford 102 Flowers               | Inception score | 4.35  | RAT-Diffusion
1 Image, 2*2 Stitchi     | CUB                              | FID             | 6.36  | RAT-Diffusion
1 Image, 2*2 Stitchi     | CUB                              | Inception score | 6.56  | RAT-Diffusion

Related Papers

Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection (2025-07-17)
FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization (2025-07-17)
A Distributed Generative AI Approach for Heterogeneous Multi-Domain Environments under Data Sharing constraints (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
FADE: Adversarial Concept Erasure in Flow Models (2025-07-16)