DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors

Yilun Chen, Shijia Huang, Shu Liu, Bei Yu, Jiaya Jia

2022-04-06

3D geometry, 3D Object Detection From Stereo Images, cross-modal alignment, 3D Object Detection

Abstract

Camera-based 3D object detectors are attractive because cameras are more widely deployed and cheaper than LiDAR sensors. We first revisit how the prior stereo detector DSGN constructs its stereo volume to represent both 3D geometry and semantics. We refine the stereo modeling and propose an advanced version, DSGN++, which aims to improve information flow throughout the 2D-to-3D pipeline in three main aspects. First, to effectively lift 2D information to the stereo volume, we propose depth-wise plane sweeping (DPS), which allows denser connections and extracts depth-guided features. Second, to capture differently spaced features, we present a novel stereo volume, the Dual-view Stereo Volume (DSV), which integrates front-view and top-view features and reconstructs sub-voxel depth in the camera frustum. Third, as the foreground region becomes less dominant in 3D space, we propose a multi-modal data editing strategy, Stereo-LiDAR Copy-Paste, which ensures cross-modal alignment and improves data efficiency. Without bells and whistles, extensive experiments in various modality setups on the popular KITTI benchmark show that our method consistently outperforms other camera-based 3D detectors for all categories. Code is available at https://github.com/chenyilun95/DSGN2.
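
For readers unfamiliar with stereo volumes, the following is a minimal sketch of generic plane sweeping over rectified stereo features, the kind of construction the abstract says DSGN++ revisits. It is not the paper's depth-wise plane sweeping (DPS) and is not taken from the DSGN2 repository. The function name `plane_sweep_volume`, its parameters, and the integer-pixel shift are illustrative assumptions. It relies only on the standard rectified-stereo relation that disparity equals focal length times baseline divided by depth.

```python
import numpy as np


def plane_sweep_volume(left_feat, right_feat, depths, focal_px, baseline_m):
    """Build a (2C, D, H, W) plane-sweep volume from rectified stereo features.

    Hypothetical helper for illustration only (not the DSGN++ implementation).
    left_feat, right_feat: arrays of shape (C, H, W).
    depths: sequence of positive candidate depths in metres.
    focal_px: focal length in pixels; baseline_m: stereo baseline in metres.
    """
    C, H, W = left_feat.shape
    volume = np.zeros((2 * C, len(depths), H, W), dtype=left_feat.dtype)
    for i, z in enumerate(depths):
        # Left features are copied unchanged to every depth plane.
        volume[:C, i] = left_feat
        # Disparity of a fronto-parallel plane at depth z, rounded to whole pixels.
        d = int(round(focal_px * baseline_m / z))
        if d >= W:
            continue  # no overlap between the two views at this depth
        # Left column u corresponds to right column u - d, so shift right by d.
        volume[C:, i, :, d:] = right_feat[:, :, : W - d]
    return volume


# Toy usage with made-up camera parameters (assumptions, not KITTI calibration).
left = np.ones((2, 3, 8), dtype=np.float32)
right = np.arange(2 * 3 * 8, dtype=np.float32).reshape(2, 3, 8)
vol = plane_sweep_volume(left, right, depths=[50.0, 25.0, 10.0],
                         focal_px=100.0, baseline_m=0.5)
```

A learned detector would typically sample with sub-pixel interpolation rather than whole-pixel shifts; the integer shift here only keeps the sketch short.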

Results

Task	Dataset	Metric	Value	Model
Object Detection	KITTI Cars Moderate	AP75	67.37	DSGN++
Object Detection	KITTI Cyclists Moderate	AP50	43.9	DSGN++
Object Detection	KITTI Pedestrians Moderate	AP50	32.74	DSGN++
3D	KITTI Cars Moderate	AP75	67.37	DSGN++
3D	KITTI Cyclists Moderate	AP50	43.9	DSGN++
3D	KITTI Pedestrians Moderate	AP50	32.74	DSGN++
3D Object Detection	KITTI Cars Moderate	AP75	67.37	DSGN++
3D Object Detection	KITTI Cyclists Moderate	AP50	43.9	DSGN++
3D Object Detection	KITTI Pedestrians Moderate	AP50	32.74	DSGN++
2D Classification	KITTI Cars Moderate	AP75	67.37	DSGN++
2D Classification	KITTI Cyclists Moderate	AP50	43.9	DSGN++
2D Classification	KITTI Pedestrians Moderate	AP50	32.74	DSGN++
2D Object Detection	KITTI Cars Moderate	AP75	67.37	DSGN++
2D Object Detection	KITTI Cyclists Moderate	AP50	43.9	DSGN++
2D Object Detection	KITTI Pedestrians Moderate	AP50	32.74	DSGN++
16k	KITTI Cars Moderate	AP75	67.37	DSGN++
16k	KITTI Cyclists Moderate	AP50	43.9	DSGN++
16k	KITTI Pedestrians Moderate	AP50	32.74	DSGN++
