Metric: FID (lower is better)
| # | Model | FID | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | TF-T2V | 8.19 | No | A Recipe for Scaling up Text-to-Video Generation... | 2023-12-25 | Code |
| 2 | HiGen | 8.6 | No | Hierarchical Spatio-temporal Decoupling for Text... | 2023-12-07 | Code |
| 3 | ModelScopeT2V | 11.09 | No | ModelScope Text-to-Video Technical Report | 2023-08-12 | Code |
| 4 | Video-LaVIT | 11.27 | No | Video-LaVIT: Unified Video-Language Pre-training... | 2024-02-05 | Code |
| 5 | Show-1 | 13.08 | No | Show-1: Marrying Pixel and Latent Diffusion Mode... | 2023-09-27 | Code |
| 6 | Make-A-Video | 13.17 | No | Make-A-Video: Text-to-Video Generation without T... | 2022-09-29 | Code |
| 7 | MMVG | 23.4 | No | Tell Me What Happened: Unifying Text-guided Vide... | 2022-11-23 | Code |
| 8 | CogVideo (English) | 23.59 | No | Make-A-Video: Text-to-Video Generation without T... | 2022-09-29 | Code |
| 9 | MagicVideo | 36.5 | No | MagicVideo: Efficient Video Generation With Late... | 2022-11-20 | - |
| 10 | NUWA | 47.68 | No | NÜWA: Visual Synthesis Pre-training for Neural v... | 2021-11-24 | Code |