Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Simple diffusion: End-to-end diffusion for high resolution images

Emiel Hoogeboom, Jonathan Heek, Tim Salimans

2023-01-26
Tasks: Denoising, Super-Resolution, Text-to-Image Generation, Vocal Bursts Intensity Prediction, Image Generation, Conditional Image Generation

Abstract

Applying diffusion models in the pixel space of high-resolution images is currently difficult. Instead, existing approaches either run diffusion in a lower-dimensional space (latent diffusion) or generate through multiple super-resolution stages, referred to as cascades. The downside is that these approaches add complexity to the diffusion framework. This paper aims to improve denoising diffusion for high-resolution images while keeping the model as simple as possible. It is centered on the research question: how can one train standard denoising diffusion models on high-resolution images and still obtain performance comparable to these alternative approaches? The four main findings are: 1) the noise schedule should be adjusted for high-resolution images, 2) it is sufficient to scale only a particular part of the architecture, 3) dropout should be added at specific locations in the architecture, and 4) downsampling is an effective strategy to avoid high-resolution feature maps. Combining these simple yet effective techniques, we achieve state-of-the-art image generation on ImageNet among diffusion models without sampling modifiers.
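The abstract says the noise schedule should be adjusted for high resolutions but does not give a formula. As a rough illustration only, the sketch below assumes a cosine schedule expressed as a log signal-to-noise ratio and shifts it by 2·log(base_resolution / resolution), so that larger images receive more noise at the same timestep. The function names, the reference resolution of 64, and the variance-preserving parameterization are assumptions made for this example, not details taken from this page.

```python
import math
from typing import Tuple


def cosine_logsnr(t: float) -> float:
    """Log-SNR of a cosine noise schedule for t strictly between 0 and 1."""
    return -2.0 * math.log(math.tan(math.pi * t / 2.0))


def shifted_logsnr(t: float, resolution: int, base_resolution: int = 64) -> float:
    """Cosine log-SNR shifted down as resolution grows (assumed shift rule)."""
    return cosine_logsnr(t) + 2.0 * math.log(base_resolution / resolution)


def alpha_sigma(logsnr: float) -> Tuple[float, float]:
    """Variance-preserving coefficients with alpha^2 + sigma^2 = 1."""
    alpha = math.sqrt(1.0 / (1.0 + math.exp(-logsnr)))  # sqrt(sigmoid(logsnr))
    sigma = math.sqrt(1.0 / (1.0 + math.exp(logsnr)))   # sqrt(sigmoid(-logsnr))
    return alpha, sigma


if __name__ == "__main__":
    # Compare the signal/noise mix at the schedule midpoint across resolutions.
    for res in (64, 128, 512):
        a, s = alpha_sigma(shifted_logsnr(0.5, res))
        print(res, round(a, 3), round(s, 3))
```

Under this assumed rule, the schedule at the reference resolution is unchanged, while at 512x512 the signal coefficient at a given timestep is smaller than at 64x64, which is one way to read "adjusting the schedule for high resolution".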

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Image Generation | ImageNet 512x512 | FID | 4.28 | simple diffusion (U-Net) |
| Image Generation | ImageNet 512x512 | Inception score | 171 | simple diffusion (U-Net) |
| Image Generation | ImageNet 512x512 | FID | 4.53 | simple diffusion (U-ViT, L) |
| Image Generation | ImageNet 512x512 | Inception score | 205.3 | simple diffusion (U-ViT, L) |
| Image Generation | ImageNet 256x256 | FID | 3.71 | simple diffusion (U-Net) |
| Image Generation | ImageNet 256x256 | FID | 3.75 | simple diffusion (U-ViT, L) |
| Image Generation | COCO (Common Objects in Context) | FID | 8.3 | simple diffusion (U-ViT) |
| Image Generation | ImageNet 128x128 | FID | 2.88 | simple diffusion (U-Net) |
| Image Generation | ImageNet 128x128 | Inception score | 137.3 | simple diffusion (U-Net) |
| Image Generation | ImageNet 128x128 | FID | 3.23 | simple diffusion (U-ViT, L) |
| Image Generation | ImageNet 128x128 | Inception score | 171.9 | simple diffusion (U-ViT, L) |
| Text-to-Image Generation | COCO (Common Objects in Context) | FID | 8.3 | simple diffusion (U-ViT) |
| Conditional Image Generation | ImageNet 128x128 | FID | 2.88 | simple diffusion (U-Net) |
| Conditional Image Generation | ImageNet 128x128 | Inception score | 137.3 | simple diffusion (U-Net) |
| Conditional Image Generation | ImageNet 128x128 | FID | 3.23 | simple diffusion (U-ViT, L) |
| Conditional Image Generation | ImageNet 128x128 | Inception score | 171.9 | simple diffusion (U-ViT, L) |
| 10-shot image generation | COCO (Common Objects in Context) | FID | 8.3 | simple diffusion (U-ViT) |
| 1 Image, 2*2 Stitchi | COCO (Common Objects in Context) | FID | 8.3 | simple diffusion (U-ViT) |
