Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Discovering Human-Object Interaction Concepts via Self-Compositional Learning

Zhi Hou, Baosheng Yu, Dacheng Tao

2022-03-27
Tasks: Affordance Recognition, Human-Object Interaction Detection, Human-Object Interaction Concept Discovery

Abstract

A comprehensive understanding of human-object interaction (HOI) requires detecting not only a small set of predefined HOI concepts (or categories) but also other reasonable HOI concepts. Current approaches, however, usually fail to explore the large number of unknown HOI concepts (i.e., unknown but reasonable combinations of verbs and objects). In this paper, 1) we introduce a novel and challenging task for comprehensive HOI understanding, termed HOI Concept Discovery; and 2) we devise a self-compositional learning framework (SCL) for HOI concept discovery. Specifically, we maintain an online-updated concept confidence matrix during training: 1) we assign pseudo-labels to all composite HOI instances according to the concept confidence matrix for self-training; and 2) we update the concept confidence matrix using the predictions for all composite HOI instances. The proposed method therefore enables learning on both known and unknown HOI concepts. We perform extensive experiments on several popular HOI datasets to demonstrate the effectiveness of the proposed method for HOI concept discovery, object affordance recognition, and HOI detection. For example, the proposed self-compositional learning framework significantly improves the performance of 1) HOI concept discovery by over 10% on HICO-DET and over 3% on V-COCO; 2) object affordance recognition by over 9% mAP on MS-COCO and HICO-DET; and 3) rare-first and non-rare-first unknown HOI detection by over 30% and 20% (relative), respectively. Code is publicly available at https://github.com/zhihou7/HOI-CL.
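
The two-step loop described in the abstract (pseudo-label composite instances from the concept confidence matrix, then update the matrix from their predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ConceptConfidence`, its methods, and the choice of a plain running mean as the update rule are all assumptions made for illustration.

```python
class ConceptConfidence:
    """Hypothetical running-mean confidence for every (verb, object) pair."""

    def __init__(self, num_verbs, num_objects):
        # Sum of predicted scores and number of updates per (verb, object) cell.
        self.total = [[0.0] * num_objects for _ in range(num_verbs)]
        self.count = [[0] * num_objects for _ in range(num_verbs)]

    def confidence(self, verb, obj):
        """Mean predicted score seen so far; 0.0 if the pair was never updated."""
        n = self.count[verb][obj]
        return self.total[verb][obj] / n if n else 0.0

    def pseudo_label(self, obj):
        """Soft verb labels for a composite instance whose object is `obj`."""
        return [self.confidence(v, obj) for v in range(len(self.total))]

    def update(self, obj, verb_scores):
        """Fold one composite instance's per-verb predictions into the matrix."""
        for v, score in enumerate(verb_scores):
            self.total[v][obj] += score
            self.count[v][obj] += 1


# Usage: two verbs, two objects; two composite instances share object 0.
matrix = ConceptConfidence(num_verbs=2, num_objects=2)
matrix.update(obj=0, verb_scores=[0.8, 0.2])
matrix.update(obj=0, verb_scores=[0.6, 0.4])
labels = matrix.pseudo_label(obj=0)  # soft targets for the next training step
```

In a real training loop the scores passed to `update` would come from the model's predictions on composite HOI instances, and the pseudo-labels would feed a self-training loss; both are left out here.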

Results

Task | Dataset | Metric | Value | Model
Human-Object Interaction Detection | HICO-DET (Unknown Concepts) | COCO-Val2017 | 56.19 | SCL
Human-Object Interaction Detection | HICO-DET (Unknown Concepts) | HICO | 64.5 | SCL
Human-Object Interaction Detection | HICO-DET (Unknown Concepts) | Novel Classes | 18.55 | SCL
Human-Object Interaction Detection | HICO-DET (Unknown Concepts) | Obj365 | 46.32 | SCL
Human-Object Interaction Detection | HICO-DET | COCO-Val2017 | 72.08 | SCL
Human-Object Interaction Detection | HICO-DET | HICO | 82.47 | SCL
Human-Object Interaction Detection | HICO-DET | Novel classes | 18.55 | SCL
Human-Object Interaction Detection | HICO-DET | Object365 | 57.53 | SCL
Affordance Recognition | HICO-DET (Unknown Concepts) | COCO-Val2017 | 56.19 | SCL
Affordance Recognition | HICO-DET (Unknown Concepts) | HICO | 64.5 | SCL
Affordance Recognition | HICO-DET (Unknown Concepts) | Novel Classes | 18.55 | SCL
Affordance Recognition | HICO-DET (Unknown Concepts) | Obj365 | 46.32 | SCL
Affordance Recognition | HICO-DET | COCO-Val2017 | 72.08 | SCL
Affordance Recognition | HICO-DET | HICO | 82.47 | SCL
Affordance Recognition | HICO-DET | Novel classes | 18.55 | SCL
Affordance Recognition | HICO-DET | Object365 | 57.53 | SCL
Human-Object Interaction Concept Discovery | HICO-DET | Unknown (AP) | 33.58 | SCL

Related Papers

RoHOI: Robustness Benchmark for Human-Object Interaction Detection (2025-07-12)
Bilateral Collaboration with Large Vision-Language Models for Open Vocabulary Human-Object Interaction Detection (2025-07-09)
VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions (2025-06-29)
HOIverse: A Synthetic Scene Graph Dataset With Human Object Interactions (2025-06-24)
On the Robustness of Human-Object Interaction Detection against Distribution Shift (2025-06-22)
Egocentric Human-Object Interaction Detection: A New Benchmark and Method (2025-06-17)
InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions (2025-06-11)
HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation (2025-06-10)