Human-Object Interaction Detection on HICO-DET
Metric: mAP (higher is better)
Results
| # | Model | mAP | Extra Data | SOTA | Paper | Date | Code |
|---|---|---|---|---|---|---|---|
| 1 | Ours (PViC+) | 46.49 | No | SOTA | Dynamic Scene Understanding from Vision-Language Representations | 2025-01-20 | - |
| 2 | RLIPv2 (Swin-L) | 45.09 | Yes | SOTA | RLIPv2: Fast Scaling of Relational Language-Image Pre-training | 2023-08-18 | Code |
| 3 | PViC-SwinL | 44.32 | No | SOTA | Exploring Predicate Visual Context in Detecting Human-Object Interactions | 2023-08-11 | Code |
| 4 | SOV-STG (Swin-L) | 43.35 | No | SOTA | Focusing on what to decode and what to train: SOV Decoding with Specific Target Guided DeNoising and Vision Language Advisor | 2023-07-05 | Code |
| 5 | DiffHOI | 41.5 | Yes | SOTA | Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model | 2023-05-20 | Code |
| 6 | ViPLO | 37.22 | No | SOTA | ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection | 2023-04-17 | Code |
| 7 | FGAHOI | 37.18 | No | SOTA | FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection | 2023-01-08 | Code |
| 8 | ERNet | 36.89 | No | - | - | - | Code |
| 9 | CQL+GEN-VLKT-L | 36.03 | No | - | Category Query Learning for Human-Object Interaction Classification | 2023-03-24 | Code |
| 10 | QAHOI (Swin-L) | 35.78 | No | SOTA | QAHOI: Query-Based Anchors for Human-Object Interaction Detection | 2021-12-16 | Code |
| 11 | CQL+GEN-VLKT-B | 35.36 | No | - | Category Query Learning for Human-Object Interaction Classification | 2023-03-24 | Code |
| 12 | Body Part Interactiveness | 35.15 | No | - | Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection | 2022-07-28 | Code |
| 13 | GEN-VLKT-R101 | 34.95 | No | - | GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection | 2022-03-26 | Code |
| 14 | HOIGen | 34.84 | No | - | Unseen No More: Unlocking the Potential of CLIP for Generative Zero-shot HOI Detection | 2024-08-12 | Code |
| 15 | PViC-R50 | 34.69 | No | - | Exploring Predicate Visual Context in Detecting Human-Object Interactions | 2023-08-11 | Code |
| 16 | HOICLIP | 34.69 | No | - | HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models | 2023-03-28 | Code |
| 17 | MUREN | 32.87 | No | - | Relational Context Learning for Human-Object Interaction Detection | 2023-04-11 | Code |
| 18 | RLIP-ParSe (ResNet-50) | 32.84 | No | - | RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection | 2022-09-05 | Code |
| 19 | ParSe (ResNet-101) | 32.76 | No | - | RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection | 2022-09-05 | Code |
| 20 | UPT-R101-DC5 | 32.62 | No | SOTA | Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer | 2021-12-03 | Code |
| 21 | DEFR | 32.35 | No | - | The Overlooked Classifier in Human-Object Interaction Recognition | 2021-12-13 | - |
| 22 | UPT-R101 | 32.31 | No | - | Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer | 2021-12-03 | Code |
| 23 | STIP (ResNet-50) | 32.22 | No | - | Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection | 2022-06-13 | Code |
| 24 | CDN (ResNet101) | 32.07 | No | SOTA | Mining the Benefits of Two-stage and One-stage HOI Detection | 2021-08-11 | Code |
| 25 | UPT-R50 | 31.66 | No | - | Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer | 2021-12-03 | Code |
| 26 | OCN (ResNet101) | 31.43 | No | - | Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics | 2022-02-01 | Code |
| 27 | QPIC (ResNet101) | 29.9 | Yes | SOTA | QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information | 2021-03-09 | Code |
| 28 | QPIC + CPC | 29.63 | No | - | Consistency Learning via Decoding Path Augmentation for Transformers in Human Object Interaction Detection | 2022-04-11 | Code |
| 29 | SCG (DETR-R101) | 29.26 | No | SOTA | Spatially Conditioned Graphs for Detecting Human-Object Interactions | 2020-12-11 | Code |
| 30 | QPIC (ResNet50) | 29.07 | Yes | - | QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information | 2021-03-09 | Code |
| 31 | AS-Net (ResNet50) | 28.87 | Yes | - | Reformulating HOI Detection as Adaptive Set Prediction | 2021-03-10 | Code |
| 32 | HOITrans(ResNet101) | 26.61 | Yes | - | End-to-End Human Object Interaction Detection with HOI Transformer | 2021-03-08 | Code |
| 33 | IDN (finetuned detector) | 26.29 | Yes | SOTA | HOI Analysis: Integrating and Decomposing Human-Object Interaction | 2020-10-30 | Code |
| 34 | HOTR + CPC | 26.16 | No | - | Consistency Learning via Decoding Path Augmentation for Transformers in Human Object Interaction Detection | 2022-04-11 | Code |
| 35 | ConsNet-F (ResNet-50) | 25.94 | Yes | SOTA | ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection | 2020-08-14 | Code |
| 36 | DRG | 24.53 | No | - | DRG: Dual Relation Graph for Human-Object Interaction Detection | 2020-08-26 | Code |
| 37 | HOITrans(ResNet50) | 23.46 | Yes | - | End-to-End Human Object Interaction Detection with HOI Transformer | 2021-03-08 | Code |
| 38 | HOTR | 23.46 | No | - | HOTR: End-to-End Human-Object Interaction Detection with Transformers | 2021-04-28 | Code |
| 39 | IDN (COCO detector) | 23.36 | No | - | HOI Analysis: Integrating and Decomposing Human-Object Interaction | 2020-10-30 | Code |
| 40 | PaStaNet | 22.65 | No | SOTA | PaStaNet: Toward Human Activity Knowledge Engine | 2020-04-02 | Code |
| 41 | PD-Net | 22.37 | No | - | Polysemy Deciphering Network for Robust Human-Object Interaction Detection | 2020-08-07 | Code |
| 42 | ConsNet (ResNet-50) | 22.15 | No | - | ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection | 2020-08-14 | Code |
| 43 | ACP++ | 22.11 | No | - | ACP++: Action Co-occurrence Priors for Human-Object Interaction Detection | 2021-09-09 | Code |
| 44 | PPDM | 21.92 | No | SOTA | PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection | 2019-12-30 | Code |
| 45 | DIRV | 21.81 | No | - | DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection | 2020-10-02 | Code |
| 46 | DJ-RN | 21.34 | No | - | Detailed 2D-3D Joint Representation for Human-Object Interaction | 2020-04-17 | Code |
| 47 | PMN | 21.21 | No | - | Pose-based Modular Network for Human-Object Interaction Detection | 2020-08-05 | Code |
| 48 | TIN (TIPAMI) | 20.93 | No | - | Transferable Interactiveness Knowledge for Human-Object Interaction Detection | 2021-01-25 | Code |
| 49 | ACP | 20.59 | No | - | - | - | Code |
| 50 | VSGNet | 19.8 | No | - | VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions | 2020-03-11 | Code |
| 51 | TIN (Interactiveness) | 17.54 | No | SOTA | Transferable Interactiveness Knowledge for Human-Object Interaction Detection | 2018-11-20 | Code |
| 52 | TIN (CVPR) | 17.22 | No | - | Transferable Interactiveness Knowledge for Human-Object Interaction Detection | 2018-11-20 | Code |
| 53 | iCAN | 14.84 | No | SOTA | iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection | 2018-08-30 | Code |
| 54 | GPNN | 13.11 | No | SOTA | Learning Human-Object Interactions by Graph Parsing Neural Networks | 2018-08-23 | Code |
| 55 | InteractNet | 9.94 | No | SOTA | Detecting and Recognizing Human-Object Interactions | 2017-04-24 | Code |