Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

PixArt-Sigma

Reported on 15 benchmarks across 1 task · 1 paper · 11 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
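As a rough illustration of what exact-name matching implies, the sketch below groups result rows by the literal model string. The row layout and the `group_by_exact_name` helper are assumptions made for this example, not the site's actual implementation, and the third row is a made-up placeholder.

```python
from collections import defaultdict

# Hypothetical result rows: (model name, metric, value).
# The first two values appear on this page; the third row is an
# invented placeholder that only shows how a differently spelled
# name would be treated.
rows = [
    ("PixArt-Sigma", "StyledTextSynth FID", 82.83),
    ("PixArt-Sigma", "TextScenesHQ FID", 72.62),
    ("PixArt-Σ", "placeholder metric", 0.0),
]

def group_by_exact_name(rows):
    """Group rows by the model name string, with no normalization."""
    grouped = defaultdict(list)
    for name, metric, value in rows:
        grouped[name].append((metric, value))
    return dict(grouped)

results = group_by_exact_name(rows)
# "PixArt-Sigma" and "PixArt-Σ" are different strings, so they end up
# under separate keys even if they refer to the same model.
```

Under this assumption, results reported under a differently spelled name would not appear on this page, and unrelated variants sharing the same name would be merged.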

Medical · 15 results

  • Image Generation on TextAtlasEval
    StyledTextSynth CLIP Score · 2024-03-07
    0.2764
    best: 0.2938 (Grok3)
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    StyledTextSynth FID · 2024-03-07
    82.83
    best: 71.09 (SD3.5 Large)
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    StyledTextSynth OCR (CER) · 2024-03-07
    0.9
    best: 0.73 (Grok3)
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    TextScenesHQ CLIP Score · 2024-03-07
    0.2347
    best: 0.3367 (Dalle3)
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    TextScenesHQ FID · 2024-03-07
    72.62
    best: 64.44 (SD3.5 Large)
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    TextScenesHQ OCR (CER) · 2024-03-07
    0.91
    best: 0.57 (Grok3)
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    TextVisionBlend CLIP Score · 2024-03-07
    0.1891
    best: 0.1979 (Infinity-2B)
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    TextVisionBlend FID · 2024-03-07
    81.29
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    TextVisionBlend OCR (Accuracy) · 2024-03-07
    2.4
    best: 41.54 (Grok3)
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    TextVisionBlend OCR (CER) · 2024-03-07
    0.83
    best: 0.57 (Grok3)
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    TextVisionBlend OCR (F1 Score) · 2024-03-07
    1.57
    best: 44.22 (Grok3)
    SOTA
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    StyledTextSynth OCR (Accuracy) · 2024-03-07
    0.42
    best: 30.58 (Dalle3)
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    StyledTextSynth OCR (F1 Score) · 2024-03-07
    0.62
    best: 38.25 (Dalle3)
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    TextScenesHQ OCR (Accuracy) · 2024-03-07
    0.34
    best: 69.26 (Dalle3)
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692
  • Image Generation on TextAtlasEval
    TextScenesHQ OCR (F1 Score) · 2024-03-07
    0.53
    best: 51.63 (Dalle3)
    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation · arXiv:2403.04692