Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Action Recognition on Something-Something V1

Metric: Top 5 Accuracy (higher is better)
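
For readers unfamiliar with the metric, here is a minimal sketch of how top-5 accuracy is usually computed: a prediction counts as correct when the true class is among the five highest-scoring classes. The function name `top_k_accuracy` and its inputs are hypothetical, chosen for illustration. The sketch also assumes the leaderboard values are percentages; evaluation details such as the number of clips or crops differ between entries and are not modeled here.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Return the percentage of samples whose true label is among
    the k highest-scoring classes (hypothetical helper)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy example with 6 classes: the first label falls inside the top 5,
# the second label is the lowest-scoring class and falls outside it.
toy_scores = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
]
toy_labels = [1, 0]
toy_result = top_k_accuracy(toy_scores, toy_labels, k=5)
```

With `k=1` the same function gives ordinary top-1 accuracy.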

Results

| # | Model | Top 5 Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | VideoMAE V2-g | 91.9 | Yes | VideoMAE V2: Scaling Video Masked Autoencoders w... | 2023-03-29 | Code |
| 2 | Side4Video (EVA ViT-E/14 | 88.8 | No | Side4Video: Spatial-Temporal Side Network for Me... | 2023-11-27 | Code |
| 3 | ATM | 88.6 | No | What Can Simple Arithmetic Operations Do for Tem... | 2023-07-18 | Code |
| 4 | UniFormerV2-L | 88 | Yes | - | - | Code |
| 5 | TDS-CLIP-ViT-L/14(8frames) | 87.8 | No | TDS-CLIP: Temporal Difference Side Network for I... | 2024-08-20 | Code |
| 6 | UniFormer-B (IN-1K + Kinetics400) | 87.3 | No | - | - | Code |
| 7 | TRG (ResNet-50) | 86.1 | No | Temporal Reasoning Graph for Activity Recognition | 2019-08-27 | - |
| 8 | UniFormer-B (IN-1K + Kinetics600) | 84.9 | No | - | - | Code |
| 9 | SELFYNet-TSM-R50En (8+16 frames, ImageNet pretrained, 2 clips) | 84.4 | Yes | Learning Self-Similarity in Space and Time as Ge... | 2021-02-14 | Code |
| 10 | BQNEn (ImageNet + K400 pretrained) | 84.2 | No | Busy-Quiet Video Disentangling for Video Classif... | 2021-03-29 | Code |
| 11 | TDN ResNet101 (one clip, center crop, 8+16 ensemble, ImageNet pretrained, RGB only) | 84.1 | No | TDN: Temporal Difference Networks for Efficient ... | 2020-12-18 | Code |
| 12 | EAN ResNet50 (single clip, center crop,8+16 ensemble, with sparse Transformer) | 83.9 | No | EAN: Event Adaptive Network for Enhanced Action ... | 2021-07-22 | Code |
| 13 | SELFYNet-TSM-R50En (8+16 frames, ImageNet pretrained, a single clip) | 83.9 | Yes | Learning Self-Similarity in Space and Time as Ge... | 2021-02-14 | Code |
| 14 | MSNet-R50En (8+16 ensemble, ImageNet pretrained) | 83.8 | Yes | MotionSqueeze: Neural Motion Feature Learning fo... | 2020-07-20 | Code |
| 15 | SELFYNet-TSM-R50 (16 frames, ImageNet pretrained) | 82.9 | Yes | Learning Self-Similarity in Space and Time as Ge... | 2021-02-14 | Code |
| 16 | RSANet-R50 (8+16 frames, ImageNet pretrained, 2 clips) | 82.8 | No | Relational Self-Attention: What's Missing in Att... | 2021-11-02 | Code |
| 17 | PAN ResNet101 (RGB only, no Flow) | 82.8 | No | PAN: Towards Fast Action Recognition via Learnin... | 2020-08-08 | Code |
| 18 | RSANet-R50 (8+16 frames, ImageNet pretrained, a single clip) | 82.6 | No | Relational Self-Attention: What's Missing in Att... | 2021-11-02 | Code |
| 19 | VoV3D-L (32frames, Kinetics pretrained, single) | 82.3 | Yes | Diverse Temporal Aggregation and Depthwise Spati... | 2020-12-01 | Code |
| 20 | MSNet-R50 (16 frames, ImageNet pretrained) | 82.3 | Yes | MotionSqueeze: Neural Motion Feature Learning fo... | 2020-07-20 | Code |
| 21 | RNL+TSM Ensemble(R50+R101, ImageNet pretrained) | 82.2 | No | Region-based Non-local Operation for Video Class... | 2020-07-17 | Code |
| 22 | RNL+TSM Ensemble(ResNet50, ImageNet pretrained) | 81.5 | No | Region-based Non-local Operation for Video Class... | 2020-07-17 | Code |
| 23 | TSM+W3 (16 frames, ResNet50) | 81.3 | No | Knowing What, Where and When to Look: Efficient ... | 2020-04-02 | - |
| 24 | RSANet-R50 (16 frames, ImageNet pretrained, a single clip) | 81.1 | No | Relational Self-Attention: What's Missing in Att... | 2021-11-02 | Code |
| 25 | VoV3D-M (32frames, Kinetics pretrained, single) | 80.43 | Yes | Diverse Temporal Aggregation and Depthwise Spati... | 2020-12-01 | Code |
| 26 | MSNet-R50 (8 frames, ImageNet pretrained) | 80.3 | No | MotionSqueeze: Neural Motion Feature Learning fo... | 2020-07-20 | Code |
| 27 | RSANet-R50 (8 frames, ImageNet pretrained, a single clip) | 79.6 | No | Relational Self-Attention: What's Missing in Att... | 2021-11-02 | Code |
| 28 | VoV3D-L (32frames, from scratch, single) | 78.7 | No | Diverse Temporal Aggregation and Depthwise Spati... | 2020-12-01 | Code |
| 29 | S3D-G (ImageNet pretrained) | 78.7 | Yes | Rethinking Spatiotemporal Feature Learning: Spee... | 2017-12-13 | Code |
| 30 | TSMEn | 78.5 | No | TSM: Temporal Shift Module for Efficient Video U... | 2018-11-20 | Code |
| 31 | S3D | 78.1 | No | Rethinking Spatiotemporal Feature Learning: Spee... | 2017-12-13 | Code |
| 32 | VoV3D-M (32frames, from scratch, single) | 78 | No | Diverse Temporal Aggregation and Depthwise Spati... | 2020-12-01 | Code |
| 33 | VoV3D-L (16frames, from scratch, single) | 78 | No | Diverse Temporal Aggregation and Depthwise Spati... | 2020-12-01 | Code |
| 34 | TSM | 77.1 | No | TSM: Temporal Shift Module for Efficient Video U... | 2018-11-20 | Code |
| 35 | VoV3D-M (16frames, from scratch, single) | 76.9 | No | Diverse Temporal Aggregation and Depthwise Spati... | 2020-12-01 | Code |