Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Action Recognition on HMDB-51

Metric: Average accuracy of 3 splits (higher is better)
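The metric can be illustrated with a minimal sketch. It assumes the reported number is the unweighted mean of per-split accuracy (in percent) over the three HMDB-51 train/test splits. The function name `average_split_accuracy` and the example values are hypothetical and are not taken from the leaderboard.

```python
from statistics import mean


def average_split_accuracy(split_accuracies):
    """Return the unweighted mean of per-split accuracies (in percent).

    Assumption: one accuracy value per split, three splits in total.
    """
    if len(split_accuracies) != 3:
        raise ValueError("expected exactly 3 per-split accuracies")
    return mean(split_accuracies)


# Hypothetical per-split accuracies, for illustration only.
example_splits = [80.0, 82.0, 81.0]
print(average_split_accuracy(example_splits))
```

Individual papers may round or weight their per-split results differently, so this shows only the basic arithmetic behind the column.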

Results

| # | Model | Average accuracy of 3 splits (%) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | VideoMAE V2-g | 88.7 | Yes | VideoMAE V2: Scaling Video Masked Autoencoders w... | 2023-03-29 | Code |
| 2 | DejaVid | 88.6 | Yes | - | - | Code |
| 3 | DEEP-HAL with ODF+SDF(I3D) | 87.56 | Yes | Self-supervising Action Recognition by Statistic... | 2020-01-14 | - |
| 4 | TO+MaxExp+IDT | 87.21 | Yes | High-order Tensor Pooling with Attention for Act... | 2021-10-11 | - |
| 5 | SCK⊕(I3D)+IDT | 86.11 | Yes | Tensor Representations for Action Recognition | 2020-12-28 | Code |
| 6 | SO+MaxExp+IDT | 85.7 | Yes | High-order Tensor Pooling with Attention for Act... | 2021-10-11 | - |
| 7 | R2+1D-BERT | 85.1 | Yes | Late Temporal Modeling in 3D CNN Architectures w... | 2020-08-03 | Code |
| 8 | Ours + ResNext101 BERT | 84.53 | No | Pose And Joint-Aware Action Recognition | 2020-10-16 | Code |
| 9 | SMART | 84.36 | No | SMART Frame Selection for Action Recognition | 2020-12-19 | - |
| 10 | OmniSource (SlowOnly-8x8-R101-RGB + I3D Flow) | 83.8 | Yes | Omni-sourced Webly-supervised Learning for Video... | 2020-03-29 | Code |
| 11 | ZeroI2V ViT-L/14 | 83.4 | Yes | ZeroI2V: Zero-Cost Adaptation of Pre-trained Tra... | 2023-10-02 | Code |
| 12 | PERF-Net (distilled S3D-G) | 83.2 | No | PERF-Net: Pose Empowered RGB-Flow Net | 2020-09-28 | - |
| 13 | BIKE | 83.1 | Yes | Bidirectional Cross-Modal Knowledge Exploration ... | 2022-12-31 | Code |
| 14 | BubbleNET | 82.6 | Yes | - | - | - |
| 15 | HAF+BoW/FV halluc | 82.48 | Yes | Hallucinating IDT Descriptors and I3D Optical Fl... | 2019-06-13 | - |
| 16 | CCS + TSN (ImageNet+Kinetics pretrained) | 81.9 | Yes | Cooperative Cross-Stream Network for Discriminat... | 2019-08-27 | - |
| 17 | RepFlow-50 ([2+1]D CNN, FcF, Non-local block) | 81.1 | No | Representation Flow for Action Recognition | 2018-10-02 | Code |
| 18 | Multi-stream I3D | 80.92 | No | - | - | - |
| 19 | MARS+RGB+FLow (64 frames, Kinetics pretrained) | 80.9 | Yes | - | - | Code |
| 20 | Two-stream I3D | 80.9 | Yes | Quo Vadis, Action Recognition? A New Model and t... | 2017-05-22 | Code |
| 21 | Two-Stream I3D (Imagenet+Kinetics pre-training) | 80.7 | Yes | Quo Vadis, Action Recognition? A New Model and t... | 2017-05-22 | Code |
| 22 | LGD-3D Two-stream | 80.5 | No | Learning Spatio-Temporal Representation with Loc... | 2019-06-13 | - |
| 23 | D3D + D3D | 80.5 | No | D3D: Distilled 3D Networks for Video Action Reco... | 2018-12-19 | Code |
| 24 | AMD(ViT-B/16) | 79.6 | Yes | Asymmetric Masked Distillation for Pre-Training ... | 2023-11-06 | - |
| 25 | D3D (Kinetics-600 pretraining) | 79.3 | No | D3D: Distilled 3D Networks for Video Action Reco... | 2018-12-19 | Code |
| 26 | LGD-3D Flow | 78.9 | No | Learning Spatio-Temporal Representation with Loc... | 2019-06-13 | - |
| 27 | Hidden Two-Stream | 78.7 | No | Hidden Two-Stream Convolutional Networks for Act... | 2017-04-02 | Code |
| 28 | R[2+1]D-TwoStream (Kinetics pretrained) | 78.7 | Yes | A Closer Look at Spatiotemporal Convolutions for... | 2017-11-30 | Code |
| 29 | D3D (Kinetics-400 pretraining) | 78.7 | No | D3D: Distilled 3D Networks for Video Action Reco... | 2018-12-19 | Code |
| 30 | I3D RGB + DMC-Net (I3D) | 77.8 | No | DMC-Net: Generating Discriminative Motion Cues f... | 2019-01-11 | - |
| 31 | BQN | 77.6 | No | Busy-Quiet Video Disentangling for Video Classif... | 2021-03-29 | Code |
| 32 | MSNet-R50 (16 frames, ImageNet pretrained) | 77.4 | No | MotionSqueeze: Neural Motion Feature Learning fo... | 2020-07-20 | Code |
| 33 | Flow-I3D (Kinetics pre-training) | 77.3 | Yes | Quo Vadis, Action Recognition? A New Model and t... | 2017-05-22 | Code |
| 34 | Flow-I3D (Imagenet+Kinetics pre-training) | 77.1 | Yes | Quo Vadis, Action Recognition? A New Model and t... | 2017-05-22 | Code |
| 35 | HATNet (32 frames) | 76.5 | No | Large Scale Holistic Video Understanding | 2019-04-25 | Code |
| 36 | R[2+1]D-Flow (Kinetics pretrained) | 76.4 | Yes | A Closer Look at Spatiotemporal Convolutions for... | 2017-11-30 | Code |
| 37 | S3D-G (ImageNet, Kinetics-400 pretrained) | 75.9 | No | Rethinking Spatiotemporal Feature Learning: Spee... | 2017-12-13 | Code |
| 38 | FASTER32 (Kinetics pretrain) | 75.7 | Yes | FASTER Recurrent Networks for Efficient Video Cl... | 2019-06-10 | - |
| 39 | LGD-3D RGB | 75.7 | No | Learning Spatio-Temporal Representation with Loc... | 2019-06-13 | - |
| 40 | RGB-I3D (Imagenet+Kinetics pre-training) | 74.8 | Yes | Quo Vadis, Action Recognition? A New Model and t... | 2017-05-22 | Code |
| 41 | R[2+1]D-RGB (Kinetics pretrained) | 74.5 | Yes | A Closer Look at Spatiotemporal Convolutions for... | 2017-11-30 | Code |
| 42 | VidTr-L | 74.4 | No | VidTr: Video Transformer Without Convolutions | 2021-04-23 | - |
| 43 | ADL+ResNet+IDT | 74.3 | No | Contrastive Video Representation Learning via Ad... | 2018-07-24 | - |
| 44 | RGB-I3D (Kinetics pre-training) | 74.3 | Yes | Quo Vadis, Action Recognition? A New Model and t... | 2017-05-22 | Code |
| 45 | Optical Flow Guided Feature | 74.2 | No | Optical Flow Guided Feature: A Fast and Robust M... | 2017-11-29 | Code |
| 46 | R[2+1D]D-TwoStream (Sports1M pretrained) | 72.7 | Yes | A Closer Look at Spatiotemporal Convolutions for... | 2017-11-30 | Code |
| 47 | TVNet+IDT | 72.6 | No | End-to-End Learning of Motion Representation for... | 2018-04-02 | Code |
| 48 | STM Network+IDT | 72.2 | No | - | - | Code |
| 49 | STM (ImageNet+Kinetics pretrain) | 72.2 | No | STM: SpatioTemporal and Motion Encoding for Acti... | 2019-08-07 | - |
| 50 | Prob-Distill | 72 | No | Attention Distillation for Learning Video Repres... | 2019-04-05 | - |
| 51 | DMC-Net (I3D) | 71.8 | No | DMC-Net: Generating Discriminative Motion Cues f... | 2019-01-11 | - |
| 52 | TesNet (ImageNet pretrained) | 71.5 | No | Learning spatio-temporal representations with te... | 2020-02-11 | - |
| 53 | HF-ECOLite (ImageNet+Kinetics pretrain) | 71.13 | Yes | Hierarchical Feature Aggregation Networks for Vi... | 2019-05-29 | - |
| 54 | ARTNet w/ TSN | 70.9 | No | Appearance-and-Relation Networks for Video Class... | 2017-11-24 | Code |
| 55 | ST-ResNet + IDT | 70.3 | No | Spatiotemporal Residual Networks for Video Actio... | 2016-11-07 | Code |
| 56 | R[2+1]D-Flow (Sports1M pretrained) | 70.1 | Yes | A Closer Look at Spatiotemporal Convolutions for... | 2017-11-30 | Code |
| 57 | Temporal Segment Networks | 69.4 | No | Temporal Segment Networks: Towards Good Practice... | 2016-08-02 | Code |
| 58 | TS-LSTM | 69 | No | TS-LSTM and Temporal-Inception: Exploiting Spati... | 2017-03-30 | Code |
| 59 | SVT | 67.2 | No | Self-supervised Video Transformer | 2021-12-02 | Code |
| 60 | R[2+1]D-RGB (Sports1M pretrained) | 66.6 | Yes | A Closer Look at Spatiotemporal Convolutions for... | 2017-11-30 | Code |
| 61 | TDD + IDT | 65.9 | No | Action Recognition with Trajectory-Pooled Deep-C... | 2015-05-19 | Code |
| 62 | VIMPAC | 65.9 | No | VIMPAC: Video Pre-Training via Masked Token Pred... | 2021-06-21 | Code |
| 63 | S:VGG-16, T:VGG-16 (ImageNet pretrained) | 65.4 | Yes | Convolutional Two-Stream Network Fusion for Vide... | 2016-04-22 | Code |
| 64 | Dynamic Image Networks + IDT | 65.2 | No | - | - | Code |
| 65 | LTC | 64.8 | No | Long-term Temporal Convolutions for Action Recog... | 2016-04-15 | Code |
| 66 | R-STAN-50 | 62.8 | No | - | - | - |
| 67 | DMC-Net (ResNet-18) | 62.8 | No | DMC-Net: Generating Discriminative Motion Cues f... | 2019-01-11 | - |
| 68 | SUSiNet (multi, Kinetics pretrained) | 62.7 | Yes | SUSiNet: See, Understand and Summarize it | 2018-12-03 | - |
| 69 | Two-Stream (ImageNet pretrained) | 59.4 | Yes | Two-Stream Convolutional Networks for Action Rec... | 2014-06-09 | Code |
| 70 | ActionFlowNet | 56.4 | No | ActionFlowNet: Learning Motion Representation fo... | 2016-12-09 | - |
| 71 | R-STAN-152 | 55.16 | No | - | - | - |
| 72 | Res3D | 54.9 | No | ConvNet Architecture Search for Spatiotemporal F... | 2017-08-16 | Code |
| 73 | R(2+1)D-18 (DistInit pretraining) | 54.8 | Yes | DistInit: Learning Video Representations Without... | 2019-01-26 | - |
| 74 | JRMN | 54.2 | No | Pose And Joint-Aware Action Recognition | 2020-10-16 | Code |
| 75 | CD-UAR | 51.8 | No | Towards Universal Representation for Unseen Acti... | 2018-03-22 | - |
| 76 | C3D | 51.6 | No | Learning Spatiotemporal Features with 3D Convolu... | 2014-12-02 | Code |
| 77 | R[2+1]D (VideoMoCo) | 49.2 | No | VideoMoCo: Contrastive Video Representation Lear... | 2021-03-10 | Code |
| 78 | 3D-ResNet-18 (VideoMoCo) | 43.6 | No | VideoMoCo: Contrastive Video Representation Lear... | 2021-03-10 | Code |