Text-to-Video Generation on MSR-VTT

Metric: CLIP-FID (lower is better)
