Video on MiT
Metric: Top 1 Accuracy (higher is better)
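For readers unfamiliar with the metric, top-1 accuracy is the percentage of samples whose single highest-scoring predicted class equals the ground-truth label. The sketch below is illustrative only: the function name `top1_accuracy` and the list-of-scores input format are assumptions made for this example, not part of the leaderboard or of any listed model's evaluation code, and it assumes one ground-truth label per sample.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return top-1 accuracy as a percentage (hypothetical helper for illustration).

    scores[i][c] is the model's score for class c on sample i;
    labels[i] is the index of the ground-truth class for sample i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # The predicted class is the index of the highest score in the row.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


if __name__ == "__main__":
    # Four toy samples over two classes; three of the four predictions are correct.
    toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
    toy_labels = [1, 0, 0, 0]
    print(top1_accuracy(toy_scores, toy_labels))  # prints 75.0
```

The result is on the same 0 to 100 scale as the values reported in the results table, so higher is better.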
Results
| # | Model | Top 1 Accuracy | Extra Data | SOTA badge | Paper | Date | Code |
|---|-------|----------------|------------|------------|-------|------|------|
| 1 | OmniVec2 | 53.1 | Yes | - | - | - | - |
| 2 | InternVideo2-1B | 50.9 | Yes | SOTA | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | 2024-03-22 | Code |
| 3 | UMT-L (ViT-L/16) | 48.7 | Yes | SOTA | Unmasked Teacher: Towards Training-Efficient Video Foundation Models | 2023-03-28 | Code |
| 4 | UniFormerV2-L | 47.8 | Yes | - | - | - | Code |
| 5 | MTV-H (WTS 60M) | 47.2 | Yes | SOTA | Multiview Transformers for Video Recognition | 2022-01-12 | Code |
| 6 | CoVeR(JFT-3B) | 46.1 | Yes | SOTA | Co-training Transformer with Videos and Images Improves Action Recognition | 2021-12-14 | - |
| 7 | CoVeR(JFT-300M) | 45 | Yes | - | Co-training Transformer with Videos and Images Improves Action Recognition | 2021-12-14 | - |
| 8 | VATT-Large | 41.1 | Yes | SOTA | VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text | 2021-04-22 | Code |
| 9 | MoViNet-A6 | 40.2 | No | SOTA | MoViNets: Mobile Video Networks for Efficient Video Recognition | 2021-03-21 | Code |
| 10 | MoViNet-A5 | 39.1 | No | - | MoViNets: Mobile Video Networks for Efficient Video Recognition | 2021-03-21 | Code |
| 11 | MoViNet-A4 | 37.9 | No | - | MoViNets: Mobile Video Networks for Efficient Video Recognition | 2021-03-21 | Code |
| 12 | VTN | 37.4 | Yes | SOTA | Video Transformer Network | 2021-02-01 | Code |
| 13 | MBT (AV) | 37.3 | No | - | Attention Bottlenecks for Multimodal Fusion | 2021-06-30 | Code |
| 14 | MoViNet-A3 | 35.6 | No | - | MoViNets: Mobile Video Networks for Efficient Video Recognition | 2021-03-21 | Code |
| 15 | MoViNet-A2 | 34.3 | No | - | MoViNets: Mobile Video Networks for Efficient Video Recognition | 2021-03-21 | Code |
| 16 | SRTG r3d-101 | 33.56 | No | SOTA | Learn to cycle: Time-consistent feature discovery for action recognition | 2020-06-15 | Code |
| 17 | MoViNet-A1 | 32 | No | - | MoViNets: Mobile Video Networks for Efficient Video Recognition | 2021-03-21 | Code |
| 18 | SRTG r(2+1)d-50 | 31.6 | No | - | Learn to cycle: Time-consistent feature discovery for action recognition | 2020-06-15 | Code |
| 19 | SRTG r3d-50 | 30.72 | No | - | Learn to cycle: Time-consistent feature discovery for action recognition | 2020-06-15 | Code |
| 20 | SRTG r(2+1)d-34 | 28.97 | No | - | Learn to cycle: Time-consistent feature discovery for action recognition | 2020-06-15 | Code |
| 21 | SRTG r3d-34 | 28.55 | No | - | Learn to cycle: Time-consistent feature discovery for action recognition | 2020-06-15 | Code |
| 22 | TRN-Multiscale | 28.27 | No | SOTA | Moments in Time Dataset: one million videos for event understanding | 2018-01-09 | Code |
| 23 | MoViNet-A0 | 27.5 | No | - | MoViNets: Mobile Video Networks for Efficient Video Recognition | 2021-03-21 | Code |