Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Action Recognition on EPIC-KITCHENS-100

Metric: Verb@1 (higher is better)
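As a rough illustration of the metric, the sketch below assumes Verb@1 means top-1 accuracy on the verb label, reported as a percentage: the share of clips whose highest-scoring predicted verb class matches the ground-truth verb. The function name and the class IDs are hypothetical and do not come from the benchmark's own evaluation code.

```python
from typing import Sequence


def top1_accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Return the percentage of clips whose top-ranked class matches the label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one clip is required")
    # Count clips where the top-1 prediction equals the ground-truth class.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Hypothetical verb class IDs for four clips (illustrative only).
predicted_verbs = [0, 3, 3, 7]
true_verbs = [0, 3, 5, 7]
score = top1_accuracy(predicted_verbs, true_verbs)
print(f"Verb@1: {score:.1f}")
```

Under this reading, three of the four hypothetical clips are classified correctly, so the sketch reports 75.0 on the same 0 to 100 scale as the leaderboard.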

Results

| # | Model | Verb@1 | Extra Data | Paper | Date | Code |
|---|-------|--------|------------|-------|------|------|
| 1 | TIM | 76.2 | Yes | TIM: A Time Interval Machine for Audio-Visual Ac... | 2024-04-08 | Code |
| 2 | LLaVAction | 76 | Yes | LLaVAction: evaluating and training multi-modal ... | 2025-03-24 | Code |
| 3 | LVMAE | 75 | Yes | Extending Video Masked Autoencoders to 128 frames | 2024-11-20 | - |
| 4 | Avion (ViT-L) | 73 | Yes | Training a Large Video Model on a Single Machine... | 2023-09-28 | Code |
| 5 | CAST(ViT-B/16) | 72.5 | No | CAST: Cross-Attention in Space and Time for Vide... | 2023-11-30 | Code |
| 6 | MoViNet-A6 | 72.2 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 7 | M&M (WTS 60M) | 72 | Yes | M&M Mix: A Multimodal Multiview Transformer Ense... | 2022-06-20 | - |
| 8 | LaViLa (TimeSformer-L) | 72 | Yes | Learning Video Representations from Large Langua... | 2022-12-08 | Code |
| 9 | TAdaFormer-L/14 | 71.7 | Yes | Temporally-Adaptive Models for Efficient Video U... | 2023-08-10 | Code |
| 10 | MeMViT-24 | 71.4 | Yes | MeMViT: Memory-Augmented Multiscale Vision Trans... | 2022-01-20 | Code |
| 11 | TAdaConvNeXtV2-S | 71 | Yes | Temporally-Adaptive Models for Efficient Video U... | 2023-08-10 | Code |
| 12 | AVT | 70.4 | No | - | - | - |
| 13 | MMT | 70.1 | No | - | - | - |
| 14 | MTV-B (WTS 60M) | 69.9 | Yes | Multiview Transformers for Video Recognition | 2022-01-12 | Code |
| 15 | OMNIVORE (Swin-B, finetuned) | 69.5 | Yes | Omnivore: A Single Model for Many Visual Modalit... | 2022-01-20 | Code |
| 16 | MoViNet-A5 | 69.1 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 17 | GSF | 69.06 | Yes | Gate-Shift-Fuse for Video Action Recognition | 2022-03-16 | Code |
| 18 | MoViNet-A4 | 68.8 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 19 | ORViT Mformer-L (ORViT blocks) | 68.4 | No | Object-Region Video Transformers | 2021-10-13 | Code |
| 20 | Mformer-L | 67.1 | Yes | Keeping Your Eye on the Ball: Trajectory Attenti... | 2021-06-09 | Code |
| 21 | MoViNet-A2 | 67.1 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 22 | Mformer-HR | 67 | Yes | Keeping Your Eye on the Ball: Trajectory Attenti... | 2021-06-09 | Code |
| 23 | Mformer | 66.7 | Yes | Keeping Your Eye on the Ball: Trajectory Attenti... | 2021-06-09 | Code |
| 24 | ViViT-L/16x2 Fact. encoder | 66.4 | No | ViViT: A Video Vision Transformer | 2021-03-29 | Code |
| 25 | TempAgg | 66 | No | Technical Report: Temporal Aggregate Representat... | 2021-06-06 | Code |
| 26 | MBT | 64.8 | No | Attention Bottlenecks for Multimodal Fusion | 2021-06-30 | Code |
| 27 | MoViNet-A0 | 64.8 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |