Action Detection on J-HMDB

Metric: Frame-mAP 0.5 (higher is better)
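
For readers unfamiliar with the metric, the following is a minimal sketch of how a frame-level mAP at an IoU threshold of 0.5 is commonly computed: each per-frame detection is matched to a ground-truth box of the same class, average precision is computed per class, and the result is averaged over classes. The function names (`iou`, `average_precision`, `frame_map`) and the data layout are hypothetical, and the AP here is the simple uninterpolated form. The leaderboard entries may use a different AP variant (for example, an interpolated PASCAL VOC-style one), so this is an illustration rather than the official evaluation code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(dets, gts, thr=0.5):
    """Uninterpolated AP for one class.

    dets: list of (frame_id, score, box) detections (assumed layout).
    gts:  dict mapping frame_id -> list of ground-truth boxes (assumed layout).
    """
    n_gt = sum(len(boxes) for boxes in gts.values())
    if n_gt == 0:
        return 0.0
    matched = {frame: [False] * len(boxes) for frame, boxes in gts.items()}
    tp, fp, ap = 0, 0, 0.0
    # Visit detections from highest to lowest confidence.
    for frame, _score, box in sorted(dets, key=lambda d: -d[1]):
        best, best_j = 0.0, -1
        for j, gt_box in enumerate(gts.get(frame, [])):
            overlap = iou(box, gt_box)
            if overlap > best:
                best, best_j = overlap, j
        if best_j >= 0 and best >= thr and not matched[frame][best_j]:
            matched[frame][best_j] = True  # each ground-truth box matches once
            tp += 1
            ap += (tp / (tp + fp)) / n_gt  # precision at each new recall step
        else:
            fp += 1
    return ap


def frame_map(dets_by_class, gts_by_class, thr=0.5):
    """Mean of the per-class APs over all classes that have ground truth."""
    classes = list(gts_by_class)
    aps = [average_precision(dets_by_class.get(c, []), gts_by_class[c], thr)
           for c in classes]
    return sum(aps) / len(aps) if aps else 0.0
```

Leaderboard values such as 88.5 would correspond to `100 * frame_map(...)` under this sketch's conventions.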

Results

#	Model	Frame-mAP 0.5	Extra Data	Paper	Date	Code
1	SiA	88.5	No	Scaling Open-Vocabulary Action Detection	2025-04-04	Code
2	HIT	83.8	No	Holistic Interaction Transformer Network for Act...	2022-10-23	Code
3	HISAN (VGG-16)	76.72	No	-	-	-
4	YOWO + LFB	75.7	No	You Only Watch Once: A Unified CNN Architecture ...	2019-11-15	Code
5	YOWO	74.4	No	You Only Watch Once: A Unified CNN Architecture ...	2019-11-15	Code
6	MOC	74	No	Actions as Moving Points	2020-01-14	Code
7	Faster-RCNN + two-stream I3D conv	73.3	No	AVA: A Video Dataset of Spatio-temporally Locali...	2017-05-23	Code
8	TACNet	65.5	No	TACNet: Transition-Aware Context Network for Spa...	2019-05-31	-
9	T-CNN	61.3	No	Tube Convolutional Neural Network (T-CNN) for Ac...	2017-03-30	Code
10	MR-TS R-CNN	58.5	No	-	-	-
11	TS R-CNN	56.9	No	-	-	-
12	Actionness	39.9	No	Actionness Estimation Using Hybrid Fully Convolu...	2016-04-25	-
13	Action Tubes	36.2	No	Finding Action Tubes	2014-11-21	Code

#1 SiA (SOTA): 88.5 Frame-mAP 0.5, 2025-04-04. Scaling Open-Vocabulary Action Detection. Code available.
#2 HIT (SOTA): 83.8 Frame-mAP 0.5, 2022-10-23. Holistic Interaction Transformer Network for Action Detection. Code available.
#3 HISAN (VGG-16): 76.72 Frame-mAP 0.5. No paper.
#4 YOWO + LFB (SOTA): 75.7 Frame-mAP 0.5, 2019-11-15. You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization. Code available.
#5 YOWO: 74.4 Frame-mAP 0.5, 2019-11-15. You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization. Code available.
#6 MOC: 74 Frame-mAP 0.5, 2020-01-14. Actions as Moving Points. Code available.
#7 Faster-RCNN + two-stream I3D conv (SOTA): 73.3 Frame-mAP 0.5, 2017-05-23. AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions. Code available.
#8 TACNet: 65.5 Frame-mAP 0.5, 2019-05-31. TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection.
#9 T-CNN (SOTA): 61.3 Frame-mAP 0.5, 2017-03-30. Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos. Code available.
#10 MR-TS R-CNN: 58.5 Frame-mAP 0.5. No paper.
#11 TS R-CNN: 56.9 Frame-mAP 0.5. No paper.
#12 Actionness (SOTA): 39.9 Frame-mAP 0.5, 2016-04-25. Actionness Estimation Using Hybrid Fully Convolutional Networks.
#13 Action Tubes (SOTA): 36.2 Frame-mAP 0.5, 2014-11-21. Finding Action Tubes. Code available.