HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation

Zijian Zhou, Miaojing Shi, Holger Caesar

2023-03-28 | ICCV 2023 | Scene Graph Generation, Panoptic Scene Graph Generation, Scene Understanding

Abstract

Panoptic Scene Graph Generation (PSG) is a recently proposed task in image scene understanding that aims to segment the image and extract triplets of subjects, objects, and their relations to build a scene graph. This task is particularly challenging for two reasons. First, it suffers from a long-tail problem in its relation categories, making naive biased methods more inclined to high-frequency relations. Existing unbiased methods tackle the long-tail problem by data/loss rebalancing to favor low-frequency relations. Second, a subject-object pair can have two or more semantically overlapping relations. While existing methods favor one over the other, our proposed HiLo framework lets different network branches specialize on low- and high-frequency relations, enforces their consistency, and fuses the results. To the best of our knowledge, we are the first to propose an explicitly unbiased PSG method. In extensive experiments, we show that our HiLo framework achieves state-of-the-art results on the PSG task. We also apply our method to the Scene Graph Generation task, which predicts boxes instead of masks, and see improvements over all baseline methods. Code is available at https://github.com/franciszzj/HiLo.

Results

Task                      Dataset      Metric  Value  Model
Scene Parsing             PSG Dataset  R@20    40.6   HiLo (SwinL)
Scene Parsing             PSG Dataset  mR@20   29.7   HiLo (SwinL)
Scene Parsing             PSG Dataset  R@20    34.1   HiLo (R50)
Scene Parsing             PSG Dataset  mR@20   23.7   HiLo (R50)
2D Semantic Segmentation  PSG Dataset  R@20    40.6   HiLo (SwinL)
2D Semantic Segmentation  PSG Dataset  mR@20   29.7   HiLo (SwinL)
2D Semantic Segmentation  PSG Dataset  R@20    34.1   HiLo (R50)
2D Semantic Segmentation  PSG Dataset  mR@20   23.7   HiLo (R50)
Scene Graph Generation    PSG Dataset  R@20    40.6   HiLo (SwinL)
Scene Graph Generation    PSG Dataset  mR@20   29.7   HiLo (SwinL)
Scene Graph Generation    PSG Dataset  R@20    34.1   HiLo (R50)
Scene Graph Generation    PSG Dataset  mR@20   23.7   HiLo (R50)
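
The R@20 and mR@20 metrics are recall at 20 and mean recall at 20. As a rough illustration of why mean recall is the metric that reflects performance on long-tail relations, the sketch below computes both under one common convention: recall is averaged over images, while mean recall first averages per predicate and then across predicates, so rare predicates count as much as frequent ones. This is a simplification, not the evaluation code behind the numbers above: triplets are matched by exact (subject, predicate, object) label equality, with no mask or box overlap check, and the function names `triplet_counts` and `recall_and_mean_recall` are hypothetical and not taken from the HiLo repository.

```python
from collections import defaultdict
from statistics import mean


def triplet_counts(gt_triplets, ranked_predictions, k=20):
    """Count hits and ground-truth totals per predicate for one image.

    Assumption: a prediction matches a ground-truth triplet only when the
    (subject, predicate, object) labels are identical. Real PSG evaluation
    additionally requires the predicted segments to overlap the ground truth.
    """
    top_k = set(ranked_predictions[:k])
    counts = defaultdict(lambda: [0, 0])  # predicate -> [hits, total]
    for triplet in gt_triplets:
        predicate = triplet[1]
        counts[predicate][1] += 1
        if triplet in top_k:
            counts[predicate][0] += 1
    return counts


def recall_and_mean_recall(images, k=20):
    """Return (R@k, mR@k) over a list of (gt_triplets, ranked_predictions)."""
    image_recalls = []
    predicate_recalls = defaultdict(list)
    for gt_triplets, ranked_predictions in images:
        if not gt_triplets:
            continue  # images without ground-truth relations are skipped
        counts = triplet_counts(gt_triplets, ranked_predictions, k)
        hits = sum(h for h, _ in counts.values())
        total = sum(t for _, t in counts.values())
        image_recalls.append(hits / total)
        for predicate, (h, t) in counts.items():
            predicate_recalls[predicate].append(h / t)
    r_at_k = mean(image_recalls)
    mr_at_k = mean([mean(values) for values in predicate_recalls.values()])
    return r_at_k, mr_at_k


# Toy data: the frequent predicate "on" is always found, the rare "holding" never is.
images = [
    (
        [("person", "on", "bench"), ("person", "holding", "cup")],
        [("person", "on", "bench"), ("dog", "on", "grass")],
    ),
    ([("dog", "on", "grass")], [("dog", "on", "grass")]),
]
print(recall_and_mean_recall(images))  # (0.75, 0.5)
```

In the toy data, a model that only ever predicts the frequent relation still reaches R@20 = 0.75, but its mR@20 drops to 0.5 because the missed rare predicate weighs as much as the frequent one.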

Related Papers

Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection (2025-07-17)
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation (2025-07-15)
Tactical Decision for Multi-UGV Confrontation with a Vision-Language Model-Based Commander (2025-07-15)
Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis (2025-07-15)
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments (2025-07-14)
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding (2025-07-10)