Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection

Luting Wang, Yi Liu, Penghui Du, Zihan Ding, Yue Liao, Qiaosong Qi, Biaolong Chen, Si Liu

2023-03-10 | CVPR 2023 | Open Vocabulary Object Detection

Abstract

Open-vocabulary object detection aims to provide object detectors trained on a fixed set of object categories with the generalizability to detect objects described by arbitrary text queries. Previous methods adopt knowledge distillation to extract knowledge from Pretrained Vision-and-Language Models (PVLMs) and transfer it to detectors. However, due to the non-adaptive proposal cropping and single-level feature mimicking processes, they suffer from information destruction during knowledge extraction and inefficient knowledge transfer. To remedy these limitations, we propose an Object-Aware Distillation Pyramid (OADP) framework, including an Object-Aware Knowledge Extraction (OAKE) module and a Distillation Pyramid (DP) mechanism. When extracting object knowledge from PVLMs, the former adaptively transforms object proposals and adopts object-aware mask attention to obtain precise and complete knowledge of objects. The latter introduces global and block distillation for more comprehensive knowledge transfer to compensate for the missing relation information in object distillation. Extensive experiments show that our method achieves significant improvement compared to current methods. Especially on the MS-COCO dataset, our OADP framework reaches $35.6$ mAP$^{\text{N}}_{50}$, surpassing the current state-of-the-art method by $3.3$ mAP$^{\text{N}}_{50}$. Code is released at https://github.com/LutingWang/OADP.
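The multi-level transfer described in the abstract (object, block, and global distillation) can be illustrated with a toy feature-mimicking loss. This is a minimal sketch, not the authors' implementation: the L1 distance, the per-level weights, and all function names (`l1_distance`, `mimic_loss`, `pyramid_loss`) are assumptions made for illustration, and the embeddings are plain Python lists standing in for detector and PVLM features.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def l1_distance(student: Vector, teacher: Vector) -> float:
    """Mean absolute difference between two embeddings of equal length."""
    if len(student) != len(teacher):
        raise ValueError("embeddings must have the same dimension")
    return sum(abs(s - t) for s, t in zip(student, teacher)) / len(student)


def mimic_loss(students: List[Vector], teachers: List[Vector]) -> float:
    """Average L1 distance over paired student/teacher embeddings."""
    if len(students) != len(teachers):
        raise ValueError("need one teacher embedding per student embedding")
    if not students:
        return 0.0
    total = sum(l1_distance(s, t) for s, t in zip(students, teachers))
    return total / len(students)


def pyramid_loss(
    student: Dict[str, List[Vector]],
    teacher: Dict[str, List[Vector]],
    weights: Dict[str, float],
) -> float:
    """Weighted sum of mimicking losses over the distillation levels.

    The level names and weights are hypothetical; the source only states
    that object, block, and global distillation are combined.
    """
    return sum(
        weights[level] * mimic_loss(student[level], teacher[level])
        for level in weights
    )


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, one per level.
    student = {"object": [[1.0, 0.0]], "block": [[0.5, 0.5]], "global": [[0.0, 1.0]]}
    teacher = {"object": [[0.0, 0.0]], "block": [[0.5, 0.5]], "global": [[0.0, 0.0]]}
    weights = {"object": 1.0, "block": 1.0, "global": 1.0}
    print(pyramid_loss(student, teacher, weights))  # prints 1.0
```

In this toy setup the block-level embeddings already match, so only the object and global levels contribute to the loss. The official code linked in the abstract is the reference for the actual loss definitions.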

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Open Vocabulary Object Detection | LVIS v1.0 | AP novel-LVIS base training | 21.7 | OADP |
| Open Vocabulary Object Detection | MSCOCO | AP 0.5 | 35.6 | OADP (G-OVD) |
| Open Vocabulary Object Detection | MSCOCO | AP 0.5 | 30 | OADP |
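
The AP 0.5 metric reported above treats a detection as correct when its intersection-over-union (IoU) with a ground-truth box of the same class is at least 0.5. The sketch below shows only that matching criterion, not the full AP computation. The `(x1, y1, x2, y2)` corner format and the helper names `iou` and `counts_as_match` are assumptions for illustration.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 <= x2 and y1 <= y2


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping region (empty if the boxes do not intersect).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def counts_as_match(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """True if the predicted box overlaps the ground truth enough for AP at `threshold`."""
    return iou(pred, gt) >= threshold


if __name__ == "__main__":
    gt = (0.0, 0.0, 2.0, 2.0)
    print(counts_as_match((1.0, 0.0, 3.0, 2.0), gt))  # IoU = 2/6, prints False
    print(counts_as_match((0.0, 0.0, 2.0, 1.5), gt))  # IoU = 3/4, prints True
```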

Related Papers

- ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction (2025-06-10)
- Gen-n-Val: Agentic Image Data Generation and Validation (2025-06-05)
- From Data to Modeling: Fully Open-vocabulary Scene Graph Generation (2025-05-26)
- FG-CLIP: Fine-Grained Visual and Textual Alignment (2025-05-08)
- VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model (2025-04-10)
- An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection (2025-03-21)
- Superpowering Open-Vocabulary Object Detectors for X-ray Vision (2025-03-21)
- Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark (2025-03-19)