Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video Captioning on ActivityNet Captions

Metric: METEOR (higher is better)

Results

| # | Model | METEOR | Extra Data | Paper | Date | Code |
|---|-------|--------|------------|-------|------|------|
| 1 | VLTinT (ae-test split) C3D/Ling | 17.97 | No | VLTinT: Visual-Linguistic Transformer-in-Transfo... | 2022-11-28 | Code |
| 2 | VLCap (ae-test split) - Appearance + Language | 17.48 | No | VLCap: Vision-Language with Contrastive Learning... | 2022-06-26 | Code |
| 3 | Vid2Seq | 17 | Yes | Vid2Seq: Large-Scale Pretraining of a Visual Lan... | 2023-02-27 | Code |
| 4 | ADV-INF + Global | 16.36 | No | - | - | Code |
| 5 | COOT (ae-test split) - Only Appearance features | 15.99 | No | COOT: Cooperative Hierarchical Transformer for V... | 2020-11-01 | Code |
| 6 | MART (ae-test split) - Appearance + Flow | 15.68 | No | MART: Memory-Augmented Recurrent Transformer for... | 2020-05-11 | Code |
| 7 | Bi-directional+intra captioning | 11.28 | No | Team RUC_AIM3 Technical Report at Activitynet 20... | 2020-06-14 | - |
| 8 | GVL | 10.03 | No | Learning Grounded Vision-Language Representation... | 2023-03-11 | Code |
| 9 | TSRM-CMG-HRNN+SCST | 9.71 | No | Dense-Captioning Events in Videos: SYSU Submissi... | 2020-06-21 | Code |
| 10 | PDVC (TSP features, no SCST) | 9.03 | No | End-to-End Dense Video Captioning with Parallel ... | 2021-08-17 | Code |
| 11 | TSP | 8.75 | No | TSP: Temporally-Sensitive Pretraining of Video E... | 2020-11-23 | Code |
| 12 | CM² | 8.55 | No | Do You Remember? Dense Video Captioning with Cro... | 2024-04-11 | Code |
| 13 | BMT | 8.44 | No | A Better Use of Audio-Visual Cues: Dense Video C... | 2020-05-17 | Code |
| 14 | iPerceive (Chadha et al., 2020) | 7.87 | No | iPerceive: Applying Common-Sense Reasoning to Mu... | 2020-11-16 | - |
| 15 | MDVC | 7.31 | No | Multi-modal Dense Video Captioning | 2020-03-17 | Code |