Zero-Shot Action Recognition on Kinetics
Metric: Top-1 Accuracy (higher is better)
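As an illustration of the metric, here is a minimal sketch of how top-1 accuracy could be computed, assuming the values reported on this page are percentages and that each video has one ground-truth class index. The function name `top1_accuracy` and the example scores are hypothetical and are not taken from any listed paper or codebase.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose highest-scoring class matches the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score; ties resolve to the first maximum.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Hypothetical similarity scores for 4 videos over 3 candidate class names.
example_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
example_labels = [1, 0, 2, 0]
print(top1_accuracy(example_scores, example_labels))  # 75.0
```

In a zero-shot setting the score rows would typically come from comparing each video against the names of classes not seen during training, but how those scores are produced differs between the listed methods.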
Results
| # | Model | Top-1 Accuracy | Extra Data | SOTA badge | Paper | Date | Code |
|---|-------|----------------|------------|------------|-------|------|------|
| 1 | TC-CLIP | 78.1 | No | Yes | Leveraging Temporal Contextualization for Video Action Recognition | 2024-04-15 | Code |
| 2 | IMP-MoE-L | 76.8 | Yes | Yes | Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | 2023-05-10 | - |
| 3 | OST | 75.1 | No | No | OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition | 2023-11-30 | Code |
| 4 | MAXI | 71.6 | No | Yes | MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge | 2023-03-15 | Code |
| 5 | OTI(ViT-L/14) | 70.6 | No | No | Orthogonal Temporal Interpolation for Zero-Shot Video Recognition | 2023-08-14 | Code |
| 6 | VideoCoCa | 70.1 | Yes | Yes | VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | 2022-12-09 | - |
| 7 | Text4Vis | 68.9 | No | Yes | Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | 2022-07-04 | Code |
| 8 | BIKE | 68.5 | No | No | Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | 2022-12-31 | Code |
| 9 | X-CLIP | 65.2 | No | No | Expanding Language-Image Pretrained Models for General Video Recognition | 2022-08-04 | Code |
| 10 | LanguageBind | 64.1 | Yes | No | LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment | 2023-10-03 | Code |
| 11 | LoCATe-GAT | 58.7 | No | No | - | - | Code |
| 12 | JigsawNet | 45.9 | No | No | - | - | Code |
| 13 | ER-ZSAR (ST+Obj) | 42.1 | No | Yes | Elaborative Rehearsal for Zero-shot Action Recognition | 2021-08-05 | Code |
| 14 | ER-ZSAR (ST) | 37.1 | No | No | Elaborative Rehearsal for Zero-shot Action Recognition | 2021-08-05 | Code |
| 15 | DEVISE | 23.8 | No | No | - | - | - |
| 16 | DEM | 23.6 | No | Yes | Learning a Deep Embedding Model for Zero-Shot Learning | 2016-11-15 | Code |
| 17 | ALE | 23.4 | No | Yes | Label-Embedding for Image Classification | 2015-03-30 | Code |
| 18 | ESZSL | 22.9 | No | No | - | - | Code |
| 19 | GCN | 22.3 | No | No | All About Knowledge Graphs for Actions | 2020-08-28 | - |
| 20 | SJE(Word Embedding) | 22.3 | No | Yes | Evaluation of Output Embeddings for Fine-Grained Image Classification | 2014-09-30 | Code |