Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Activity Recognition on EPIC-KITCHENS-100

Metric: Noun@1 (higher is better)
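
The page does not define Noun@1. A minimal sketch, assuming it denotes top-1 accuracy over noun classes reported as a percentage; the function name `top1_accuracy` and the toy scores below are hypothetical and not taken from any listed paper:

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Percentage of clips whose highest-scoring class equals the ground-truth label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score; ties resolve to the lowest index.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Hypothetical noun scores for three clips over four noun classes.
noun_scores = [
    [0.1, 0.7, 0.1, 0.1],
    [0.5, 0.2, 0.2, 0.1],
    [0.2, 0.2, 0.5, 0.1],
]
noun_labels = [1, 0, 3]  # assumed ground-truth noun class per clip
noun_at_1 = top1_accuracy(noun_scores, noun_labels)
```

Under this reading, a leaderboard value of 69 would mean the top-ranked noun prediction is correct for 69% of evaluated clips.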


Results

| # | Model | Noun@1 | Extra Data | Paper | Date | Code |
|---|-------|--------|------------|-------|------|------|
| 1 | LLaVAction | 69 | Yes | LLaVAction: evaluating and training multi-modal ... | 2025-03-24 | Code |
| 2 | TIM | 66.4 | Yes | TIM: A Time Interval Machine for Audio-Visual Ac... | 2024-04-08 | Code |
| 3 | M&M (WTS 60M) | 66.3 | Yes | M&M Mix: A Multimodal Multiview Transformer Ense... | 2022-06-20 | - |
| 4 | Avion (ViT-L) | 65.4 | Yes | Training a Large Video Model on a Single Machine... | 2023-09-28 | Code |
| 5 | TAdaFormer-L/14 | 64.1 | Yes | Temporally-Adaptive Models for Efficient Video U... | 2023-08-10 | Code |
| 6 | MTV-B (WTS 60M) | 63.9 | Yes | Multiview Transformers for Video Recognition | 2022-01-12 | Code |
| 7 | LaViLa (TimeSformer-L) | 62.9 | Yes | Learning Video Representations from Large Langua... | 2022-12-08 | Code |
| 8 | LVMAE | 61.8 | Yes | Extending Video Masked Autoencoders to 128 frames | 2024-11-20 | - |
| 9 | OMNIVORE (Swin-B, finetuned) | 61.7 | Yes | Omnivore: A Single Model for Many Visual Modalit... | 2022-01-20 | Code |
| 10 | MMT | 61 | No | - | - | - |
| 11 | CAST(ViT-B/16) | 60.9 | No | CAST: Cross-Attention in Space and Time for Vide... | 2023-11-30 | Code |
| 12 | MeMViT-24 | 60.3 | Yes | MeMViT: Memory-Augmented Multiscale Vision Trans... | 2022-01-20 | Code |
| 13 | TAdaConvNeXtV2-S | 60.2 | Yes | Temporally-Adaptive Models for Efficient Video U... | 2023-08-10 | Code |
| 14 | AVT | 59.3 | No | - | - | - |
| 15 | ORViT Mformer-L (ORViT blocks) | 58.7 | No | Object-Region Video Transformers | 2021-10-13 | Code |
| 16 | Mformer-HR | 58.5 | Yes | Keeping Your Eye on the Ball: Trajectory Attenti... | 2021-06-09 | Code |
| 17 | MBT | 58 | No | Attention Bottlenecks for Multimodal Fusion | 2021-06-30 | Code |
| 18 | Mformer-L | 57.6 | Yes | Keeping Your Eye on the Ball: Trajectory Attenti... | 2021-06-09 | Code |
| 19 | MoViNet-A6 | 57.3 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 20 | ViViT-L/16x2 Fact. encoder | 56.8 | No | ViViT: A Video Vision Transformer | 2021-03-29 | Code |
| 21 | Mformer | 56.5 | Yes | Keeping Your Eye on the Ball: Trajectory Attenti... | 2021-06-09 | Code |
| 22 | MoViNet-A4 | 56.2 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 23 | MoViNet-A5 | 55.1 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 24 | TempAgg | 53.35 | No | Technical Report: Temporal Aggregate Representat... | 2021-06-06 | Code |
| 25 | GSF | 53.18 | Yes | Gate-Shift-Fuse for Video Action Recognition | 2022-03-16 | Code |
| 26 | MoViNet-A2 | 52.3 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 27 | MoViNet-A0 | 47.4 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |