
VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision

Mengyin Liu, Jie Jiang, Chao Zhu, Xu-Cheng Yin

2023-04-06 · CVPR 2023 · Autonomous Driving · Pedestrian Detection
Paper · PDF · Code (official)

Abstract

Detecting pedestrians accurately in urban scenes is significant for realistic applications like autonomous driving or video surveillance. However, confusing human-like objects often lead to wrong detections, and small-scale or heavily occluded pedestrians are easily missed due to their unusual appearances. To address these challenges, object regions alone are inadequate, so fully utilizing more explicit and semantic contexts becomes a key problem. Meanwhile, previous context-aware pedestrian detectors either learn only latent contexts from visual clues or require laborious annotations to obtain explicit semantic contexts. Therefore, we propose in this paper a novel approach via Vision-Language semantic self-supervision for context-aware Pedestrian Detection (VLPD) to model explicit semantic contexts without any extra annotations. Firstly, we propose a self-supervised Vision-Language Semantic (VLS) segmentation method, which learns both fully supervised pedestrian detection and contextual segmentation via explicit labels of semantic classes self-generated by vision-language models. Furthermore, a self-supervised Prototypical Semantic Contrastive (PSC) learning method is proposed to better discriminate pedestrians from other classes, based on the more explicit and semantic contexts obtained from VLS. Extensive experiments on popular benchmarks show that our proposed VLPD achieves superior performance over the previous state-of-the-art methods, particularly under challenging circumstances like small scale and heavy occlusion. Code is available at https://github.com/lmy98129/VLPD.
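
The abstract describes two self-supervised components: VLS, which uses a vision-language model to self-generate per-pixel semantic labels for contextual segmentation, and PSC, which contrasts pixel features against class prototypes. The sketch below illustrates how such pseudo-label generation and a prototype-based contrastive loss could look in PyTorch; the function names, tensor shapes, and class-prompt scheme are illustrative assumptions, not the authors' implementation (see the official repository for that).

```python
# Minimal sketch (not the authors' code): pixel-wise pseudo-labels from a
# CLIP-like vision-language model, plus a simple prototypical contrastive loss.
# All names and shapes are illustrative placeholders.
import torch
import torch.nn.functional as F

def vls_pseudo_labels(dense_features, text_embeddings):
    """Assign each pixel the semantic class whose text embedding is most similar.

    dense_features:  (B, C, H, W) features from the detector backbone
    text_embeddings: (K, C) embeddings of K class prompts (e.g. "a photo of a road")
    returns:         (B, H, W) integer pseudo-labels
    """
    feats = F.normalize(dense_features, dim=1)             # cosine-similarity space
    texts = F.normalize(text_embeddings, dim=1)
    logits = torch.einsum("bchw,kc->bkhw", feats, texts)   # per-pixel class scores
    return logits.argmax(dim=1)

def prototypical_contrastive_loss(dense_features, pseudo_labels, num_classes, tau=0.1):
    """Pull pixel features toward their class prototype, push them away from others."""
    B, C, H, W = dense_features.shape
    feats = F.normalize(dense_features, dim=1).permute(0, 2, 3, 1).reshape(-1, C)
    labels = pseudo_labels.reshape(-1)

    # Class prototypes: mean feature over all pixels assigned to each class.
    protos = torch.stack([
        feats[labels == k].mean(dim=0) if (labels == k).any()
        else torch.zeros(C, device=feats.device)
        for k in range(num_classes)
    ])
    protos = F.normalize(protos, dim=1)

    logits = feats @ protos.t() / tau                      # (num_pixels, K)
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    feats = torch.randn(2, 512, 32, 64)   # stand-in backbone features
    texts = torch.randn(8, 512)           # stand-in embeddings for 8 classes
    labels = vls_pseudo_labels(feats, texts)
    loss = prototypical_contrastive_loss(feats, labels, num_classes=8)
    print(labels.shape, loss.item())
```

In this sketch the pseudo-labels supervise a contextual segmentation head with no extra annotation cost, while the contrastive term sharpens the separation between pedestrian features and other semantic classes, mirroring the division of labor the abstract attributes to VLS and PSC.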

Results

Task | Dataset | Metric | Value | Model
Autonomous Vehicles | Caltech | Heavy MR^-2 | 37.7 | VLPD
Autonomous Vehicles | Caltech | Reasonable Miss Rate | 2.3 | VLPD
Autonomous Vehicles | CityPersons | Bare MR^-2 | 6.1 | VLPD
Autonomous Vehicles | CityPersons | Heavy MR^-2 | 43.1 | VLPD
Autonomous Vehicles | CityPersons | Partial MR^-2 | 8.8 | VLPD
Autonomous Vehicles | CityPersons | Reasonable MR^-2 | 9.4 | VLPD
Autonomous Vehicles | CityPersons | Small MR^-2 | 10.9 | VLPD
Pedestrian Detection | Caltech | Heavy MR^-2 | 37.7 | VLPD
Pedestrian Detection | Caltech | Reasonable Miss Rate | 2.3 | VLPD
Pedestrian Detection | CityPersons | Bare MR^-2 | 6.1 | VLPD
Pedestrian Detection | CityPersons | Heavy MR^-2 | 43.1 | VLPD
Pedestrian Detection | CityPersons | Partial MR^-2 | 8.8 | VLPD
Pedestrian Detection | CityPersons | Reasonable MR^-2 | 9.4 | VLPD
Pedestrian Detection | CityPersons | Small MR^-2 | 10.9 | VLPD
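
All values above are miss-rate-based, so lower is better. MR^-2 commonly denotes the log-average miss rate used on the Caltech and CityPersons benchmarks: the miss rate is averaged over nine false-positives-per-image (FPPI) thresholds spaced evenly in log-space between 10^-2 and 10^0. Under that standard definition (an assumption here, as the page itself does not define the metric), it can be stated as:

```latex
% Log-average miss rate over FPPI in [10^-2, 10^0] (geometric mean of 9 samples)
\[
\mathrm{MR}^{-2} \;=\; \exp\!\Bigl(\tfrac{1}{9}\sum_{i=1}^{9}\ln \mathrm{MR}(f_i)\Bigr),
\qquad f_i \in \{10^{-2}, \dots, 10^{0}\}\ \text{(log-spaced FPPI values)}.
\]
```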

Related Papers

GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving (2025-07-19)
AGENTS-LLM: Augmentative GENeration of Challenging Traffic Scenarios with an Agentic LLM Framework (2025-07-18)
World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models (2025-07-17)
Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
LaViPlan: Language-Guided Visual Path Planning with RLVR (2025-07-17)
Safeguarding Federated Learning-based Road Condition Classification (2025-07-16)
Towards Autonomous Riding: A Review of Perception, Planning, and Control in Intelligent Two-Wheelers (2025-07-16)