Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Detecting Human-Object Interaction via Fabricated Compositional Learning

Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, Dacheng Tao

2021-03-15 · CVPR 2021
Tasks: Affordance Recognition · Human-Object Interaction Detection · Scene Understanding
Paper · PDF · Code (official)

Abstract

Human-Object Interaction (HOI) detection, which infers the relationships between humans and objects from images or videos, is a fundamental task for high-level scene understanding. However, HOI detection usually suffers from the open long-tailed nature of interactions with objects, whereas humans have a remarkably powerful compositional perception ability to recognize rare or unseen HOI samples. Inspired by this, we devise a novel HOI compositional learning framework, termed Fabricated Compositional Learning (FCL), to address the problem of open long-tailed HOI detection. Specifically, we introduce an object fabricator to generate effective object representations, and then combine verbs with fabricated objects to compose new HOI samples. With the proposed object fabricator, we are able to generate large-scale HOI samples for rare and unseen categories to alleviate the open long-tailed issues in HOI detection. Extensive experiments on the most popular HOI detection dataset, HICO-DET, demonstrate the effectiveness of the proposed method for imbalanced HOI detection, significantly improving the state-of-the-art performance on rare and unseen HOI categories. Code is available at https://github.com/zhihou7/HOI-CL.
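The core idea in the abstract — fabricate object features for arbitrary object classes, then compose them with verb features to form new HOI training samples — can be sketched as follows. This is a minimal toy illustration, not the authors' implementation (see the linked repository for that): the dimensions, the `fabricate_object` and `compose_hoi` names, and the randomly initialized fabricator weights are all hypothetical stand-ins for the learned components described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions; the real model uses ROI features
# from a detection backbone (https://github.com/zhihou7/HOI-CL).
VERB_DIM, OBJ_DIM, NOISE_DIM = 8, 8, 4
NUM_OBJ_CLASSES = 5

def fabricate_object(obj_class_id, noise, W, b):
    """Toy object fabricator: maps (class one-hot, noise) -> object feature."""
    onehot = np.zeros(NUM_OBJ_CLASSES)
    onehot[obj_class_id] = 1.0
    x = np.concatenate([onehot, noise])
    return np.maximum(W @ x + b, 0.0)  # linear layer + ReLU

def compose_hoi(verb_feat, obj_feat):
    """Compose a (verb, object) pair into one HOI sample by concatenation."""
    return np.concatenate([verb_feat, obj_feat])

# Randomly initialized fabricator parameters (untrained, for illustration;
# in FCL these are learned jointly with the HOI classifier).
W = rng.normal(size=(OBJ_DIM, NUM_OBJ_CLASSES + NOISE_DIM))
b = np.zeros(OBJ_DIM)

verb_feat = rng.normal(size=VERB_DIM)   # e.g. a feature for the verb "ride"
for cls in range(NUM_OBJ_CLASSES):      # fabricate every object class, so rare
    noise = rng.normal(size=NOISE_DIM)  # and unseen (verb, object) pairs also
    obj_feat = fabricate_object(cls, noise, W, b)   # get training samples
    sample = compose_hoi(verb_feat, obj_feat)
    assert sample.shape == (VERB_DIM + OBJ_DIM,)
```

The point of the composition step is data balance: by pairing one observed verb feature with fabricated features for every object class, the model sees HOI combinations that are rare or entirely absent from the training set.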

Results

Task                               | Dataset  | Metric        | Value | Model
-----------------------------------|----------|---------------|-------|------
Human-Object Interaction Detection | HICO-DET | COCO-Val2017  | 25.11 | FCL
Human-Object Interaction Detection | HICO-DET | HICO          | 37.32 | FCL
Human-Object Interaction Detection | HICO-DET | Novel classes | 6.8   | FCL
Human-Object Interaction Detection | HICO-DET | Object365     | 25.21 | FCL
Affordance Recognition             | HICO-DET | COCO-Val2017  | 25.11 | FCL
Affordance Recognition             | HICO-DET | HICO          | 37.32 | FCL
Affordance Recognition             | HICO-DET | Novel classes | 6.8   | FCL
Affordance Recognition             | HICO-DET | Object365     | 25.21 | FCL

Related Papers

Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection (2025-07-17)
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation (2025-07-15)
Tactical Decision for Multi-UGV Confrontation with a Vision-Language Model-Based Commander (2025-07-15)
Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis (2025-07-15)
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments (2025-07-14)
RoHOI: Robustness Benchmark for Human-Object Interaction Detection (2025-07-12)