Zero-Shot Action Recognition on HMDB51
Metric: Top-1 Accuracy (higher is better)
Results
| # | Model | Top-1 Accuracy | Extra Data | Paper | Date | Code | SOTA badge |
|---|-------|----------------|------------|-------|------|------|------------|
| 1 | MOV (ViT-L/14) | 64.7 | No | Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models | 2022-07-15 | - | SOTA |
| 2 | OTI(ViT-L/14) | 64 | No | Orthogonal Temporal Interpolation for Zero-Shot Video Recognition | 2023-08-14 | Code | - |
| 3 | BIKE | 61.4 | No | Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | 2022-12-31 | Code | - |
| 4 | MOV (ViT-B/16) | 60.8 | No | Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models | 2022-07-15 | - | - |
| 5 | IMP-MoE-L | 59.1 | Yes | Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | 2023-05-10 | - | - |
| 6 | VideoCoCa | 58.7 | Yes | VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | 2022-12-09 | - | - |
| 7 | Text4Vis | 58.4 | No | Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | 2022-07-04 | Code | SOTA |
| 8 | TC-CLIP | 56 | No | Leveraging Temporal Contextualization for Video Action Recognition | 2024-04-15 | Code | - |
| 9 | OST | 55.9 | No | OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition | 2023-11-30 | Code | - |
| 10 | MAXI | 52.3 | No | MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge | 2023-03-15 | Code | - |
| 11 | VicTR (ViT-B/16) | 51 | No | VicTR: Video-conditioned Text Representations for Activity Recognition | 2023-04-05 | - | - |
| 12 | LoCATe-GAT | 50.7 | No | - | - | Code | - |
| 13 | X-CLIP | 44.6 | No | Expanding Language-Image Pretrained Models for General Video Recognition | 2022-08-04 | Code | - |
| 14 | CLASTER | 43.2 | No | CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition | 2021-01-18 | - | SOTA |
| 15 | ResT | 41.1 | No | Cross-modal Representation Learning for Zero-shot Action Recognition | 2022-05-03 | - | - |
| 16 | AURL | 39 | No | Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification | 2022-03-29 | Code | - |
| 17 | JigsawNet | 38.7 | No | - | - | Code | - |
| 18 | SPOT | 35.9 | No | Synthetic Sample Selection for Generalized Zero-Shot Learning | 2023-04-06 | - | - |
| 19 | ER-ZSAR | 35.3 | No | Elaborative Rehearsal for Zero-shot Action Recognition | 2021-08-05 | Code | - |
| 20 | E2E | 32.7 | No | Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications | 2020-03-03 | Code | SOTA |
| 21 | UR | 24.4 | No | Towards Universal Representation for Unseen Action Recognition | 2018-03-22 | - | SOTA |
| 22 | TS-GCN | 23.2 | No | - | - | Code | - |
| 23 | ZSECOC | 22.6 | No | - | - | - | - |
| 24 | ASR | 21.8 | No | Alternative Semantic Representations for Zero-Shot Human Action Recognition | 2017-06-28 | - | SOTA |
| 25 | MTE | 19.7 | No | Multi-Task Zero-Shot Action Recognition with Prioritised Data Augmentation | 2016-11-26 | - | SOTA |
| 26 | ESZSL | 18.5 | No | - | - | - | - |
| 27 | O2A | 15.6 | No | Objects2action: Classifying and localizing actions without any video example | 2015-10-23 | - | SOTA |
| 28 | SJE(word embedding) | 13.3 | No | Evaluation of Output Embeddings for Fine-Grained Image Classification | 2014-09-30 | Code | SOTA |