Video on Kinetics-700
Metric: Top-1 Accuracy (higher is better)
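As a rough illustration of the metric, the sketch below computes top-1 accuracy from per-class scores: a clip counts as correct when its single highest-scoring class matches the ground-truth label. The function name `top1_accuracy`, the list-based input layout, and the percentage scaling are assumptions made for this example; individual submissions may differ in details such as multi-view or multi-crop averaging.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of clips whose highest-scoring class equals the label.

    scores: one row of per-class scores per clip (hypothetical layout).
    labels: ground-truth class index for each clip.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one clip is required")

    correct = 0
    for clip_scores, label in zip(scores, labels):
        # Index of the largest score is the top-1 prediction.
        predicted = max(range(len(clip_scores)), key=clip_scores.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Four clips, three classes; the third clip is misclassified.
example_scores = [
    [0.1, 0.2, 0.7],
    [0.8, 0.1, 0.1],
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
]
example_labels = [2, 0, 0, 0]
print(top1_accuracy(example_scores, example_labels))  # 75.0
```

Higher values are better: a model that ranks the correct class first on more clips scores higher.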
Results
| # | Model | Top-1 Accuracy (%) | Extra Data | Paper | Date | Code |
|---|-------|--------------------|------------|-------|------|------|
| 1 | InternVideo2-6B | 85.9 | Yes | InternVideo2: Scaling Foundation Models for Mult... | 2024-03-22 | Code |
| 2 | InternVideo2-1B | 85.4 | Yes | InternVideo2: Scaling Foundation Models for Mult... | 2024-03-22 | Code |
| 3 | InternVideo-T | 84 | Yes | InternVideo: General Video Foundation Models via... | 2022-12-06 | Code |
| 4 | TubeViT-L | 83.8 | No | Rethinking Video ViTs: Sparse Video Tubes for Jo... | 2022-12-06 | Code |
| 5 | UMT-L (ViT-L/16) | 83.6 | Yes | Unmasked Teacher: Towards Training-Efficient Vid... | 2023-03-28 | Code |
| 6 | MTV-H (WTS 60M) | 83.4 | Yes | Multiview Transformers for Video Recognition | 2022-01-12 | Code |
| 7 | UniFormerV2-L | 82.7 | Yes | - | - | Code |
| 8 | CoCa (finetuned) | 82.7 | Yes | CoCa: Contrastive Captioners are Image-Text Foun... | 2022-05-04 | Code |
| 9 | CoCa (frozen) | 81.1 | Yes | CoCa: Contrastive Captioners are Image-Text Foun... | 2022-05-04 | Code |
| 10 | Hiera-H (no extra data) | 81.1 | No | Hiera: A Hierarchical Vision Transformer without... | 2023-06-01 | Code |
| 11 | MaskFeat (no extra data, MViT-L) | 80.4 | No | Masked Feature Prediction for Self-Supervised Vi... | 2021-12-16 | Code |
| 12 | mPLUG-2 | 80.4 | Yes | mPLUG-2: A Modularized Multi-modal Foundation Mo... | 2023-02-01 | Code |
| 13 | AIM (CLIP ViT-L/14, 32x224) | 80.4 | Yes | AIM: Adapting Image Models for Efficient Video A... | 2023-02-06 | Code |
| 14 | CoVeR (JFT-3B) | 79.8 | Yes | Co-training Transformer with Videos and Images I... | 2021-12-14 | - |
| 15 | MViTv2-L (ImageNet-21k pretrain) | 79.4 | Yes | MViTv2: Improved Multiscale Vision Transformers ... | 2021-12-02 | Code |
| 16 | MoViNet-A6 | 79.4 | No | MViTv2: Improved Multiscale Vision Transformers ... | 2021-12-02 | Code |
| 17 | CoVeR (JFT-300M) | 78.5 | Yes | Co-training Transformer with Videos and Images I... | 2021-12-14 | - |
| 18 | MViTv2-B | 76.6 | No | MViTv2: Improved Multiscale Vision Transformers ... | 2021-12-02 | Code |
| 19 | MoViNet-A6 | 72.3 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 20 | MoViNet-A5 | 71.7 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 21 | En-VidTr-L | 70.8 | No | VidTr: Video Transformer Without Convolutions | 2021-04-23 | - |
| 22 | MoViNet-A4 | 70.7 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 23 | VidTr-L | 70.2 | No | VidTr: Video Transformer Without Convolutions | 2021-04-23 | - |
| 24 | VidTr-M | 69.5 | No | VidTr: Video Transformer Without Convolutions | 2021-04-23 | - |
| 25 | MoViNet-A3 | 68 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 26 | VidTr-S | 67.3 | No | VidTr: Video Transformer Without Convolutions | 2021-04-23 | - |
| 27 | MoViNet-A2 | 66.7 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 28 | MoViNet-A1 | 63.5 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 29 | MoViNet-A0 | 58.5 | No | MoViNets: Mobile Video Networks for Efficient Vi... | 2021-03-21 | Code |
| 30 | SRTG r3d-101 | 56.46 | No | Learn to cycle: Time-consistent feature discover... | 2020-06-15 | Code |
| 31 | SRTG r(2+1)d-50 | 54.17 | No | Learn to cycle: Time-consistent feature discover... | 2020-06-15 | Code |
| 32 | SRTG r3d-50 | 53.52 | No | Learn to cycle: Time-consistent feature discover... | 2020-06-15 | Code |
| 33 | SEER (RegNet10B) | 51.9 | Yes | Vision Models Are More Robust And Fair When Pret... | 2022-02-16 | Code |
| 34 | SRTG r(2+1)d-34 | 49.43 | No | Learn to cycle: Time-consistent feature discover... | 2020-06-15 | Code |
| 35 | SRTG r3d-34 | 49.15 | No | Learn to cycle: Time-consistent feature discover... | 2020-06-15 | Code |