Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video Captioning on ActivityNet Captions

Metric: ROUGE-L (higher is better)
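For readers unfamiliar with the metric, the following is a minimal sketch of sentence-level ROUGE-L, which scores a candidate caption against a reference by the length of their longest common subsequence (LCS) of tokens. It is illustrative only: the whitespace tokenization, the lowercasing, the single-reference setup, and the default `beta=1.0` are assumptions, and the page does not state which implementation or settings produced the leaderboard numbers. The function names `lcs_length` and `rouge_l` are hypothetical. The sketch returns a value in [0, 1], whereas the scores listed on this page appear to be on a 0-100 scale.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-score in [0, 1].

    Assumptions (not taken from the leaderboard): whitespace tokenization,
    lowercasing, one reference per candidate, and beta=1.0.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # fraction of candidate tokens in the LCS
    recall = lcs / len(ref)      # fraction of reference tokens in the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    score = rouge_l("a man is playing guitar", "a man plays the guitar")
    print(f"ROUGE-L: {score:.2f}")
```

In this example the LCS is "a man guitar" (3 tokens out of 5 on each side), so precision and recall are both 0.6. Evaluation toolkits may weight recall more heavily through a larger `beta`, so scores from this sketch are not directly comparable to the table.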

Results

| # | Model | ROUGE-L | Extra Data | Paper | Date | Code |
|---|-------|---------|------------|-------|------|------|
| 1 | VLTinT (ae-test split) C3D/Ling | 36.56 | No | VLTinT: Visual-Linguistic Transformer-in-Transfo... | 2022-11-28 | Code |
| 2 | VLCap (ae-test split) - Appearance + Language | 35.99 | No | VLCap: Vision-Language with Contrastive Learning... | 2022-06-26 | Code |
| 3 | VideoCoCa | 35 | Yes | VideoCoCa: Video-Text Modeling with Zero-Shot Tr... | 2022-12-09 | - |
| 4 | COOT (ae-test split) - Only Appearance features | 31.45 | No | COOT: Cooperative Hierarchical Transformer for V... | 2020-11-01 | Code |