Video on Kinetics-700
Metric: Top-5 Accuracy (higher is better)
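As a rough illustration of the metric, the sketch below counts a prediction as correct when the true class is among the five highest-scoring classes, and reports the result as a percentage to match the scale used on this page. The function name `top_k_accuracy`, its input layout (one list of per-class scores per clip) and the toy data are assumptions made for illustration; they are not taken from any of the listed papers or their evaluation code.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Return the percentage of samples whose true label is among the
    k highest-scoring classes (hypothetical helper, for illustration)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Toy data: 3 clips, 6 classes (made up; Kinetics-700 itself has 700 classes).
toy_scores = [
    [0.10, 0.50, 0.20, 0.05, 0.09, 0.06],
    [0.60, 0.15, 0.10, 0.08, 0.04, 0.03],
    [0.02, 0.03, 0.05, 0.10, 0.30, 0.50],
]
toy_labels = [2, 3, 5]

acc_top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
acc_top4 = top_k_accuracy(toy_scores, toy_labels, k=4)
```

With `k=2` the second clip is a miss because its true class ranks fourth; widening to `k=4` turns it into a hit, which is why top-5 figures are always at least as high as top-1 figures for the same model.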
Results
| # | Model | Top-5 Accuracy | Extra Data | Paper | Date | Code | Badge |
|---|-------|----------------|------------|-------|------|------|-------|
| 1 | UMT-L (ViT-L/16) | 96.7 | Yes | Unmasked Teacher: Towards Training-Efficient Video Foundation Models | 2023-03-28 | Code | SOTA |
| 2 | TubeViT-L | 96.6 | No | Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning | 2022-12-06 | Code | SOTA |
| 3 | MTV-H (WTS 60M) | 96.2 | Yes | Multiview Transformers for Video Recognition | 2022-01-12 | Code | SOTA |
| 4 | UniFormerV2-L | 96.2 | Yes | No paper | - | Code | - |
| 5 | MaskFeat (no extra data, MViT-L) | 95.7 | No | Masked Feature Prediction for Self-Supervised Visual Pre-Training | 2021-12-16 | Code | SOTA |
| 6 | mPLUG-2 | 94.9 | Yes | mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video | 2023-02-01 | Code | - |
| 7 | CoVeR (JFT-3B) | 94.9 | Yes | Co-training Transformer with Videos and Images Improves Action Recognition | 2021-12-14 | - | - |
| 8 | MViTv2-L (ImageNet-21k pretrain) | 94.9 | Yes | MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | 2021-12-02 | Code | SOTA |
| 9 | CoVeR (JFT-300M) | 94.2 | Yes | Co-training Transformer with Videos and Images Improves Action Recognition | 2021-12-14 | - | - |
| 10 | MViTv2-B | 93.2 | No | MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | 2021-12-02 | Code | - |
| 11 | En-VidTr-L | 89.4 | No | VidTr: Video Transformer Without Convolutions | 2021-04-23 | - | SOTA |
| 12 | VidTr-L | 89 | No | VidTr: Video Transformer Without Convolutions | 2021-04-23 | - | - |
| 13 | VidTr-M | 88.3 | No | VidTr: Video Transformer Without Convolutions | 2021-04-23 | - | - |
| 14 | VidTr-S | 87.7 | No | VidTr: Video Transformer Without Convolutions | 2021-04-23 | - | - |
| 15 | SRTG r3d-101 | 76.82 | No | Learn to cycle: Time-consistent feature discovery for action recognition | 2020-06-15 | Code | SOTA |
| 16 | SRTG r(2+1)d-50 | 74.62 | No | Learn to cycle: Time-consistent feature discovery for action recognition | 2020-06-15 | Code | - |
| 17 | SRTG r3d-50 | 74.17 | No | Learn to cycle: Time-consistent feature discovery for action recognition | 2020-06-15 | Code | - |
| 18 | SRTG r(2+1)d-34 | 73.23 | No | Learn to cycle: Time-consistent feature discovery for action recognition | 2020-06-15 | Code | - |
| 19 | SRTG r3d-34 | 72.68 | No | Learn to cycle: Time-consistent feature discovery for action recognition | 2020-06-15 | Code | - |
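The page does not define the SOTA badge. One reading that fits the listed results is that a dated entry carries the badge when its score is strictly higher than every score published earlier, with the best score winning among entries that share a date. The sketch below checks that reading against the scores and dates listed above; the rule itself, the function name `sota_at_release` and the tie handling are assumptions, not a documented behavior of the leaderboard.

```python
# (model, top-5 accuracy, date or None) copied from the results above.
rows = [
    ("UMT-L (ViT-L/16)", 96.7, "2023-03-28"),
    ("TubeViT-L", 96.6, "2022-12-06"),
    ("MTV-H (WTS 60M)", 96.2, "2022-01-12"),
    ("UniFormerV2-L", 96.2, None),  # no paper and no date listed
    ("MaskFeat (no extra data, MViT-L)", 95.7, "2021-12-16"),
    ("mPLUG-2", 94.9, "2023-02-01"),
    ("CoVeR (JFT-3B)", 94.9, "2021-12-14"),
    ("MViTv2-L (ImageNet-21k pretrain)", 94.9, "2021-12-02"),
    ("CoVeR (JFT-300M)", 94.2, "2021-12-14"),
    ("MViTv2-B", 93.2, "2021-12-02"),
    ("En-VidTr-L", 89.4, "2021-04-23"),
    ("VidTr-L", 89.0, "2021-04-23"),
    ("VidTr-M", 88.3, "2021-04-23"),
    ("VidTr-S", 87.7, "2021-04-23"),
    ("SRTG r3d-101", 76.82, "2020-06-15"),
    ("SRTG r(2+1)d-50", 74.62, "2020-06-15"),
    ("SRTG r3d-50", 74.17, "2020-06-15"),
    ("SRTG r(2+1)d-34", 73.23, "2020-06-15"),
    ("SRTG r3d-34", 72.68, "2020-06-15"),
]


def sota_at_release(entries):
    """Return the models whose score strictly beats all earlier-dated scores.

    Assumed rule, for illustration only: undated entries are skipped, and
    entries sharing a date are visited best score first.
    """
    dated = [e for e in entries if e[2] is not None]
    # ISO dates sort correctly as strings; negate the score for descending order.
    dated.sort(key=lambda e: (e[2], -e[1]))

    best_so_far = float("-inf")
    flagged = set()
    for model, score, _ in dated:
        if score > best_so_far:
            flagged.add(model)
            best_so_far = score
    return flagged


badged = sota_at_release(rows)
```

Under this assumed rule, `badged` contains the same seven models that carry the SOTA badge in the listing. Entries that only tie an earlier score, such as CoVeR (JFT-3B) at 94.9, are not flagged.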