Metric: TextScenesHQ OCR (CER) (lower is better)
| # | Model | TextScenesHQ OCR (CER) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Grok3 | 0.57 | No | - | - | - |
| 2 | SD3.5 Large | 0.73 | No | - | - | - |
| 3 | Infinity-2B | 0.88 | No | Infinity-MM: Scaling Multimodal Performance with... | 2024-10-24 | Code |
| 4 | PixArt-Sigma | 0.91 | No | PixArt-Σ: Weak-to-Strong Training of Diffusion T... | 2024-03-07 | Code |
| 5 | AnyText | 0.95 | No | AnyText: Multilingual Visual Text Generation And... | 2023-11-06 | Code |
| 6 | TextDiffuser2 | 0.96 | No | TextDiffuser-2: Unleashing the Power of Language... | 2023-11-28 | - |