Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data

Jian Shi, Peter Wonka

Published: 2023-12-11
Tasks: 3D Human Pose Estimation, Keypoint Detection, Keypoint Estimation

Abstract

We present VoxelKP, a novel fully sparse network architecture tailored for human keypoint estimation in LiDAR data. The key challenge is that objects are distributed sparsely in 3D space, while human keypoint detection requires detailed local information wherever humans are present. We propose four novel ideas in this paper. First, we propose sparse selective kernels to capture multi-scale context. Second, we introduce sparse box-attention to focus on learning spatial correlations between keypoints within each human instance. Third, we incorporate a spatial encoding to leverage absolute 3D coordinates when projecting 3D voxels to a 2D grid encoding a bird's-eye view. Finally, we propose hybrid feature learning to combine the processing of per-voxel features with sparse convolution. We evaluate our method on the Waymo dataset and achieve an improvement of 27% on the MPJPE metric compared to HUM3DIL, the state of the art trained on the same data, and 12% compared to GC-KPL, the state of the art pretrained on a 25× larger dataset. To the best of our knowledge, VoxelKP is the first single-stage, fully sparse network specifically designed for the challenging task of 3D keypoint estimation from LiDAR data, and it achieves state-of-the-art performance. Our code is available at https://github.com/shijianjian/VoxelKP.
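The third idea in the abstract, carrying absolute 3D coordinates along when sparse voxels are projected onto a bird's-eye-view grid, can be illustrated with a small sketch. This is not the authors' implementation: the function name `voxels_to_bev`, the dictionary-based sparse representation, the max-pooling over height, and the choice to append voxel-center coordinates to the features are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

VoxelIndex = Tuple[int, int, int]
BevIndex = Tuple[int, int]


def voxels_to_bev(
    voxels: Dict[VoxelIndex, Sequence[float]],
    voxel_size: Sequence[float],
    origin: Sequence[float],
) -> Dict[BevIndex, List[float]]:
    """Collapse sparse 3D voxels onto a sparse 2D bird's-eye-view grid.

    Hypothetical scheme: each voxel feature is extended with the absolute
    (x, y, z) position of the voxel center, then voxels that share the same
    (ix, iy) cell are merged with an element-wise max over the height axis.
    """
    bev: Dict[BevIndex, List[float]] = {}
    for (ix, iy, iz), feat in voxels.items():
        # Absolute coordinates of the voxel center in the sensor frame.
        center = [
            origin[axis] + (idx + 0.5) * voxel_size[axis]
            for axis, idx in enumerate((ix, iy, iz))
        ]
        augmented = list(feat) + center
        key = (ix, iy)
        if key in bev:
            # Element-wise max keeps the grid sparse: only occupied cells exist.
            bev[key] = [max(a, b) for a, b in zip(bev[key], augmented)]
        else:
            bev[key] = augmented
    return bev


# Two voxels stacked in the same column and one isolated voxel.
example = {(0, 0, 0): [1.0], (0, 0, 1): [0.5], (2, 1, 0): [2.0]}
bev_grid = voxels_to_bev(example, voxel_size=(1.0, 1.0, 1.0), origin=(0.0, 0.0, 0.0))
```

Only occupied cells appear in the result, which mirrors the "fully sparse" idea in spirit; the official repository linked in the abstract is the reference for how the paper actually does this.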

Results

Task                      Dataset             Metric  Value  Model
3D Human Pose Estimation  Waymo Open Dataset  MPJPE   8.87   VoxelKP
Pose Estimation           Waymo Open Dataset  MPJPE   8.87   VoxelKP
3D                        Waymo Open Dataset  MPJPE   8.87   VoxelKP
1 Image, 2*2 Stitchi      Waymo Open Dataset  MPJPE   8.87   VoxelKP
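
For readers unfamiliar with the metric in the table above, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth joint positions. The sketch below is a minimal illustration in plain Python; the function name `mpjpe` and the optional `visible` mask are assumptions for this example and do not reproduce the exact Waymo evaluation protocol.

```python
import math
from typing import Optional, Sequence

Point3D = Sequence[float]


def mpjpe(
    pred: Sequence[Point3D],
    gt: Sequence[Point3D],
    visible: Optional[Sequence[bool]] = None,
) -> float:
    """Mean Euclidean distance between matched predicted and true joints.

    `visible` is a hypothetical per-joint mask: joints marked False are
    skipped, since unlabeled joints are usually excluded from the average.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of joints")
    if visible is None:
        visible = [True] * len(gt)
    distances = [math.dist(p, g) for p, g, v in zip(pred, gt, visible) if v]
    if not distances:
        raise ValueError("at least one visible joint is required")
    return sum(distances) / len(distances)


# Two joints with errors of 3 and 4 units give a mean error of 3.5.
pred_joints = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
true_joints = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
error_all = mpjpe(pred_joints, true_joints)
error_first_only = mpjpe(pred_joints, true_joints, visible=[True, False])
```

The result is expressed in whatever unit the joint coordinates use, and lower values are better.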

Related Papers

KptLLM++: Towards Generic Keypoint Comprehension with Large Language Model (2025-07-15)
GKNet: Graph-based Keypoints Network for Monocular Pose Estimation of Non-cooperative Spacecraft (2025-07-15)
FPC-Net: Revisiting SuperPoint with Descriptor-Free Keypoint Detection via Feature Pyramids and Consistency-Based Implicit Matching (2025-07-14)
Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection (2025-07-10)
Attend-and-Refine: Interactive keypoint estimation and quantitative cervical vertebrae analysis for bone age assessment (2025-07-10)
Reading a Ruler in the Wild (2025-07-09)
MK-Pose: Category-Level Object Pose Estimation via Multimodal-Based Keypoint Learning (2025-07-09)
Systematic Comparison of Projection Methods for Monocular 3D Human Pose Estimation on Fisheye Images (2025-06-24)