Dense Video Captioning on ViTT

Metric: CIDEr (higher is better)
