Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Scene Understanding

Hongcheng Yang, Dingkang Liang, Dingyuan Zhang, Zhe Liu, Zhikang Zou, Xingyu Jiang, Yingying Zhu

2024-02-27
Tasks: Scene Understanding, Semantic Segmentation, 3D Semantic Segmentation, 3D Part Segmentation, object-detection, 3D Object Detection, Object Detection

Abstract

Recent advances in point cloud learning have enabled intelligent vehicles and robots to better comprehend 3D environments. However, processing large-scale 3D scenes remains challenging, so efficient downsampling methods play a crucial role in point cloud learning. Existing downsampling methods either incur a heavy computational burden or sacrifice fine-grained geometric information. To this end, this paper presents an advanced sampler that achieves both high accuracy and efficiency. The proposed method builds on voxel centroid sampling but addresses the challenges of voxel size determination and the preservation of critical geometric cues. Specifically, we propose a Voxel Adaptation Module that adaptively adjusts voxel sizes with reference to a point-based downsampling ratio. This ensures that the sampling results exhibit a favorable distribution for comprehending various 3D objects or scenes. Meanwhile, we introduce a network compatible with arbitrary voxel sizes for sampling and feature extraction while maintaining high efficiency. The proposed approach is demonstrated on 3D object detection and 3D semantic segmentation. Compared to existing state-of-the-art methods, our approach achieves better accuracy on large-scale outdoor and indoor datasets, e.g., Waymo and ScanNet, with promising efficiency.
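The abstract combines two ideas: voxel centroid sampling, and choosing the voxel size so that the result matches a point-based downsampling ratio. The following is a minimal sketch of both, not the authors' implementation. The function names `voxel_centroid_sample` and `fit_voxel_size` are hypothetical, and the bisection search is an assumed stand-in for the Voxel Adaptation Module, whose actual mechanism the abstract does not describe.

```python
import math
import random
from collections import defaultdict


def voxel_centroid_sample(points, voxel_size):
    """Return one centroid per occupied voxel of edge length `voxel_size`."""
    # Each accumulator holds [sum_x, sum_y, sum_z, point_count] for one voxel.
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        acc = sums[key]
        acc[0] += x
        acc[1] += y
        acc[2] += z
        acc[3] += 1
    return [(sx / n, sy / n, sz / n) for sx, sy, sz, n in sums.values()]


def fit_voxel_size(points, target_ratio, lo=1e-3, hi=10.0, iters=30):
    """Search for a voxel size that keeps about `target_ratio` of the points.

    Assumption: plain bisection on the voxel size, used here only for
    illustration. It is NOT the paper's Voxel Adaptation Module.
    `lo` must keep too many points and `hi` too few for the search to work.
    """
    target = target_ratio * len(points)
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint, since sizes span decades
        kept = len(voxel_centroid_sample(points, mid))
        if kept > target:
            lo = mid  # too many centroids: voxels are too small
        else:
            hi = mid  # too few (or exactly enough): voxels are too large
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    random.seed(0)
    cloud = [(random.random(), random.random(), random.random())
             for _ in range(2000)]
    size = fit_voxel_size(cloud, target_ratio=0.25)
    sampled = voxel_centroid_sample(cloud, size)
    print(f"voxel size {size:.4f} keeps {len(sampled)} of {len(cloud)} points")
```

The point of tying the voxel size to a ratio is that a fixed size keeps very different fractions of a dense indoor scan and a sparse outdoor sweep, whereas a ratio-driven size gives a comparable point budget for both.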

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | ScanNet | val mIoU | 76.1 | AVS-Net |
| Semantic Segmentation | ShapeNet-Part | Class Average IoU | 85.7 | AVS-Net |
| Semantic Segmentation | ShapeNet-Part | Instance Average IoU | 87.3 | AVS-Net |
| Object Detection | Waymo Open Dataset | mAPH/L2 | 72.4 | AVS-Net |
| 3D | Waymo Open Dataset | mAPH/L2 | 72.4 | AVS-Net |
| 3D Object Detection | Waymo Open Dataset | mAPH/L2 | 72.4 | AVS-Net |
| 2D Classification | Waymo Open Dataset | mAPH/L2 | 72.4 | AVS-Net |
| 2D Object Detection | Waymo Open Dataset | mAPH/L2 | 72.4 | AVS-Net |
| 10-shot image generation | ScanNet | val mIoU | 76.1 | AVS-Net |
| 10-shot image generation | ShapeNet-Part | Class Average IoU | 85.7 | AVS-Net |
| 10-shot image generation | ShapeNet-Part | Instance Average IoU | 87.3 | AVS-Net |
| 16k | Waymo Open Dataset | mAPH/L2 | 72.4 | AVS-Net |

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection (2025-07-17)
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)