Activity Recognition on UCF101 (finetuned)
Metric: 3-fold Accuracy (higher is better)
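The metric can be illustrated with a minimal sketch. It assumes that "3-fold Accuracy" means top-1 accuracy averaged over the three standard UCF101 train/test splits, which is the usual convention; individual papers may deviate from it. The function names and the per-split numbers below are hypothetical and are not taken from any leaderboard entry.

```python
from statistics import mean


def top1_accuracy(predictions, labels):
    """Return top-1 accuracy in percent for one test split."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


def three_fold_accuracy(split_accuracies):
    """Average per-split accuracies (assumed: one value per UCF101 split)."""
    if len(split_accuracies) != 3:
        raise ValueError("expected exactly three per-split accuracies")
    return mean(split_accuracies)


# Hypothetical per-split accuracies, for illustration only.
example_splits = [90.0, 92.0, 94.0]
print(three_fold_accuracy(example_splits))
```

A model is fine-tuned and evaluated once per split, and only the averaged value is reported in the table.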
Results
| # | Model | 3-fold Accuracy | Extra Data | Paper | Date | Code link | SOTA badge |
|---|-------|-----------------|------------|-------|------|-----------|------------|
| 1 | BraVe:V-FA (TSM-50x2) | 95.7 | No | Broaden Your Views for Self-Supervised Video Learning | 2021-03-30 | Yes | SOTA |
| 2 | XDC | 95.5 | No | Self-Supervised Learning by Cross-Modal Audio-Video Clustering | 2019-11-28 | Yes | SOTA |
| 3 | CVRL (R3D-152 2x; K600) | 93.9 | No | Spatiotemporal Contrastive Video Representation Learning | 2020-08-09 | Yes | |
| 4 | ELo | 93.8 | No | Evolving Losses for Unsupervised Video Representation Learning | 2020-02-26 | - | |
| 5 | CVRL (R3D-50; K600) | 93.4 | No | Spatiotemporal Contrastive Video Representation Learning | 2020-08-09 | Yes | |
| 6 | CVRL (R3D-50; K400) | 92.2 | No | Spatiotemporal Contrastive Video Representation Learning | 2020-08-09 | Yes | |
| 7 | AVID | 91.5 | No | Audio-Visual Instance Discrimination with Cross-Modal Agreement | 2020-04-27 | Yes | |
| 8 | MMV | 91.5 | No | Self-Supervised MultiModal Versatile Networks | 2020-06-29 | Yes | |
| 9 | ViCC (S3D; R+F) | 90.5 | No | Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting | 2021-06-18 | Yes | |
| 10 | AVTS | 89 | No | Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization | 2018-06-30 | - | SOTA |
| 11 | ViCC (R2+1D; R+F) | 88.8 | No | Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting | 2021-06-18 | Yes | |
| 12 | CoCLR | 87.9 | No | Self-supervised Co-training for Video Representation Learning | 2020-10-19 | Yes | |
| 13 | ViCC (S3D; RGB) | 84.3 | No | Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting | 2021-06-18 | Yes | |
| 14 | ViCC (R2+1D; RGB) | 82.8 | No | Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting | 2021-06-18 | Yes | |