Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion

Lucas Nunes, Rodrigo Marcuzzi, Benedikt Mersch, Jens Behley, Cyrill Stachniss

Published: 2024-03-20 · CVPR 2024
Tags: Denoising · Autonomous Vehicles · Lidar Scene Completion
Links: Paper · PDF · Code (official)

Abstract

Computer vision techniques play a central role in the perception stack of autonomous vehicles, where they are employed to interpret the vehicle's surroundings from sensor data. 3D LiDAR sensors are commonly used to collect sparse 3D point clouds from the scene. However, compared to human perception, such systems struggle to infer the unseen parts of the scene from those sparse point clouds. To address this, the scene completion task aims to predict the gaps in the LiDAR measurements and thereby achieve a more complete scene representation. Given the promising results of recent diffusion models as generative models for images, we propose extending them to achieve scene completion from a single 3D LiDAR scan. Previous works applied diffusion models to range images extracted from LiDAR data, directly reusing image-based diffusion methods. In contrast, we propose to operate directly on the points, reformulating the noising and denoising diffusion process so that it can work efficiently at scene scale. Together with our approach, we propose a regularization loss to stabilize the noise predicted during the denoising process. Our experimental evaluation shows that our method can complete the scene given a single LiDAR scan as input, producing a scene with more detail than state-of-the-art scene completion methods. We believe that our proposed diffusion process formulation can support further research on diffusion models applied to scene-scale point cloud data.
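The abstract describes reformulating the diffusion noising process to operate directly on point coordinates rather than on range images. As a rough illustration of the kind of forward process involved, here is a minimal sketch of a standard DDPM-style noising step applied to an (N, 3) point cloud; all names and the linear noise schedule are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def ddpm_forward_noise(points, t, betas, rng=None):
    """Apply t steps of DDPM forward noising to an (N, 3) point cloud.

    points: (N, 3) array of point coordinates.
    t: diffusion timestep (0-based index into betas).
    betas: (T,) noise schedule.
    Returns the noised points and the Gaussian noise that was added.
    """
    rng = rng or np.random.default_rng()
    alphas = 1.0 - betas
    alpha_bar = np.prod(alphas[: t + 1])  # cumulative product up to step t
    noise = rng.standard_normal(points.shape)
    # Closed-form sample of q(x_t | x_0): scale the clean points and add noise.
    noised = np.sqrt(alpha_bar) * points + np.sqrt(1.0 - alpha_bar) * noise
    return noised, noise

# Example: noise a toy "scan" of 1000 points halfway through the schedule.
betas = np.linspace(1e-4, 0.02, 1000)  # assumed linear schedule
points = np.random.default_rng(0).uniform(-50.0, 50.0, size=(1000, 3))
noised, eps = ddpm_forward_noise(points, t=500, betas=betas)
```

A denoising network would then be trained to predict `eps` from `noised` and `t`; the paper's contribution is making this process tractable at full scene scale, with a regularization loss on the predicted noise.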

Results

Task                   | Dataset       | Metric           | Value | Model
Lidar Scene Completion | SemanticKITTI | Chamfer Distance | 0.376 | LiDiff (refined)
Lidar Scene Completion | SemanticKITTI | JSD 3D           | 0.573 | LiDiff (refined)
Lidar Scene Completion | SemanticKITTI | JSD BEV          | 0.416 | LiDiff (refined)
Lidar Scene Completion | SemanticKITTI | Voxel IoU 0.1m   | 13.4  | LiDiff (refined)
Lidar Scene Completion | SemanticKITTI | Voxel IoU 0.2m   | 22.99 | LiDiff (refined)
Lidar Scene Completion | SemanticKITTI | Voxel IoU 0.5m   | 32.43 | LiDiff (refined)
Lidar Scene Completion | SemanticKITTI | Chamfer Distance | 0.434 | LiDiff
Lidar Scene Completion | SemanticKITTI | JSD 3D           | 0.564 | LiDiff
Lidar Scene Completion | SemanticKITTI | JSD BEV          | 0.444 | LiDiff
Lidar Scene Completion | SemanticKITTI | Voxel IoU 0.1m   | 4.67  | LiDiff
Lidar Scene Completion | SemanticKITTI | Voxel IoU 0.2m   | 16.79 | LiDiff
Lidar Scene Completion | SemanticKITTI | Voxel IoU 0.5m   | 31.47 | LiDiff
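Among the reported metrics, Chamfer Distance measures how closely the completed point cloud matches a reference. A minimal sketch of the symmetric Chamfer Distance between two point sets follows; note that definitions vary (mean vs. sum, squared vs. unsquared distances), so this is one common variant and may differ from the paper's exact evaluation protocol:

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3).

    For each point in a, take the distance to its nearest neighbor in b
    (and vice versa), then sum the two mean nearest-neighbor distances.
    """
    # Pairwise Euclidean distances via broadcasting, shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Identical point sets have zero Chamfer Distance.
pts = np.random.default_rng(0).normal(size=(100, 3))
print(chamfer_distance(pts, pts))  # 0.0
```

The brute-force pairwise matrix is O(N·M) in memory; practical evaluations on full LiDAR scenes typically use a k-d tree or voxel hashing for the nearest-neighbor queries instead.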

Related Papers

fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
HUG-VAS: A Hierarchical NURBS-Based Generative Model for Aortic Geometry Synthesis and Controllable Editing (2025-07-15)
AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air (2025-07-15)
A statistical physics framework for optimal learning (2025-07-10)
LangMamba: A Language-driven Mamba Framework for Low-dose CT Denoising with Vision-language Models (2025-07-08)