Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Action Recognition on EPIC-KITCHENS-100

Metric: Action@1 (higher is better)
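
This page does not define the metric. The sketch below assumes Action@1 is top-1 accuracy over (verb, noun) action pairs, the usual convention for EPIC-KITCHENS, where a prediction counts as correct only if both the verb and the noun match the label. The function name `action_top1` and the integer class encoding are hypothetical, chosen for illustration only.

```python
from typing import Sequence, Tuple

# Hypothetical encoding: an action is a (verb_class, noun_class) pair.
Action = Tuple[int, int]


def action_top1(predictions: Sequence[Action], labels: Sequence[Action]) -> float:
    """Return the percentage of clips whose top-ranked action matches the label.

    Assumption: a prediction is correct only when verb AND noun both match.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one labelled clip is required")
    correct = sum(pred == true for pred, true in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Four clips: two fully correct, one with a wrong noun, one with a wrong verb.
preds = [(1, 2), (3, 4), (5, 6), (7, 8)]
truth = [(1, 2), (3, 9), (5, 6), (0, 8)]
score = action_top1(preds, truth)
print(f"Action@1 = {score:.1f}")
```

Under this reading, a score of 58.3 would mean that the top-ranked (verb, noun) pair was fully correct for 58.3% of the evaluated clips.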

Results

| # | Model | Action@1 | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | LLaVAction | 58.3 | Yes | LLaVAction: evaluating and training multi-modal ... | 2025-03-24 | Code |
| 2 | TIM | 56.4 | Yes | TIM: A Time Interval Machine for Audio-Visual Ac... | 2024-04-08 | Code |
| 3 | Avion (ViT-L) | 54.4 | Yes | Training a Large Video Model on a Single Machine... | 2023-09-28 | Code |
| 4 | M&M (WTS 60M) | 53.6 | Yes | M&M Mix: A Multimodal Multiview Transformer Ense... | 2022-06-20 | - |
| 5 | LVMAE | 52.1 | Yes | Extending Video Masked Autoencoders to 128 frames | 2024-11-20 | - |
| 6 | TAdaFormer-L/14 | 51.8 | Yes | Temporally-Adaptive Models for Efficient Video U... | 2023-08-10 | Code |
| 7 | LaViLa (TimeSformer-L) | 51 | Yes | Learning Video Representations from Large Langua... | 2022-12-08 | Code |
| 8 | MTV-B (WTS 60M) | 50.5 | Yes | Multiview Transformers for Video Recognition | 2022-01-12 | Code |
| 9 | OMNIVORE (Swin-B, finetuned) | 49.9 | Yes | Omnivore: A Single Model for Many Visual Modalit... | 2022-01-20 | Code |
| 10 | CAST(ViT-B/16) | 49.3 | No | CAST: Cross-Attention in Space and Time for Vide... | 2023-11-30 | Code |
| 11 | TAdaConvNeXtV2-S | 48.9 | Yes | Temporally-Adaptive Models for Efficient Video U... | 2023-08-10 | Code |
| 12 | MeMViT-24 | 48.4 | Yes | MeMViT: Memory-Augmented Multiscale Vision Trans... | 2022-01-20 | Code |
| 13 | MMT | 47.8 | No | - | - | - |
| 14 | MoViNet-A6 | 47.7 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 15 | AVT | 47.2 | No | - | - | - |
| 16 | ORViT Mformer-L (ORViT blocks) | 45.7 | No | Object-Region Video Transformers | 2021-10-13 | Code |
| 17 | TempAgg | 45.26 | No | Technical Report: Temporal Aggregate Representat... | 2021-06-06 | Code |
| 18 | MoViNet-A5 | 44.5 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 19 | Mformer-HR | 44.5 | Yes | Keeping Your Eye on the Ball: Trajectory Attenti... | 2021-06-09 | Code |
| 20 | GSF | 44.48 | Yes | Gate-Shift-Fuse for Video Action Recognition | 2022-03-16 | Code |
| 21 | MoViNet-A4 | 44.4 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 22 | Mformer-L | 44.1 | Yes | Keeping Your Eye on the Ball: Trajectory Attenti... | 2021-06-09 | Code |
| 23 | ViViT-L/16x2 Fact. encoder | 44 | No | ViViT: A Video Vision Transformer | 2021-03-29 | Code |
| 24 | MBT | 43.4 | No | Attention Bottlenecks for Multimodal Fusion | 2021-06-30 | Code |
| 25 | Mformer | 43.1 | Yes | Keeping Your Eye on the Ball: Trajectory Attenti... | 2021-06-09 | Code |
| 26 | MoViNet-A2 | 41.2 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 27 | TSM | 37.39 | No | Rescaling Egocentric Vision | 2020-06-23 | Code |
| 28 | SlowFast | 36.81 | No | Rescaling Egocentric Vision | 2020-06-23 | Code |
| 29 | MoViNet-A0 | 36.8 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 30 | TBN | 35.55 | No | Rescaling Egocentric Vision | 2020-06-23 | Code |
| 31 | TRN | 35.28 | No | Rescaling Egocentric Vision | 2020-06-23 | Code |
| 32 | TSN | 33.57 | No | Rescaling Egocentric Vision | 2020-06-23 | Code |