Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

LION: Linear Group RNN for 3D Object Detection in Point Clouds

Zhe Liu, Jinghua Hou, Xinyu Wang, Xiaoqing Ye, Jingdong Wang, Hengshuang Zhao, Xiang Bai

2024-07-25
Tags: Long-range modeling, object-detection, 3D Object Detection, Object Detection

Abstract

The benefit of transformers in large-scale 3D point cloud perception tasks, such as 3D object detection, is limited by their quadratic computation cost when modeling long-range relationships. In contrast, linear RNNs have low computational complexity and are well suited to long-range modeling. Toward this goal, we propose a simple and effective window-based framework built on LInear grOup RNN (i.e., a linear RNN applied to grouped features) for accurate 3D object detection, called LION. Its key property is that it allows sufficient feature interaction within a much larger group than transformer-based methods can afford. However, effectively applying linear group RNN to 3D object detection in highly sparse point clouds is not trivial, because linear RNNs are limited in their ability to model spatial structure. To tackle this problem, we introduce a 3D spatial feature descriptor and integrate it into the linear group RNN operators to enhance their spatial features, rather than blindly increasing the number of scanning orders for voxel features. To further address the challenge of highly sparse point clouds, we propose a 3D voxel generation strategy that densifies foreground features by exploiting the auto-regressive nature of linear group RNNs. Extensive experiments verify the effectiveness of the proposed components and the generalization of LION across different linear group RNN operators, including Mamba, RWKV, and RetNet. Furthermore, LION-Mamba achieves state-of-the-art results on the Waymo, nuScenes, Argoverse V2, and ONCE datasets. Finally, our method supports a variety of advanced linear RNN operators (e.g., RetNet, RWKV, Mamba, xLSTM, and TTT) on the small but popular KITTI dataset, for a quick experience with our linear RNN-based framework.
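As a rough illustration of what "a linear RNN applied to grouped features" can mean, the sketch below runs a toy linear recurrence independently inside fixed-size groups of voxel features. It is not the authors' implementation: the function name `linear_group_rnn`, the `group_size` and `decay` parameters, and the fixed scalar recurrence are all assumptions made for illustration. The operators named in the abstract (Mamba, RWKV, RetNet) use learned parameters and more elaborate state updates.

```python
import numpy as np

def linear_group_rnn(features, group_size, decay=0.9):
    """Toy linear recurrence h_t = decay * h_{t-1} + x_t, run per group.

    features:   (N, C) array of voxel features, assumed already sorted
                in some scan order (hypothetical input layout).
    group_size: number of consecutive voxels per group (hypothetical).
    decay:      fixed scalar state decay (hypothetical; real operators
                learn their parameters).
    """
    n, c = features.shape
    out = np.empty_like(features, dtype=float)
    for start in range(0, n, group_size):
        h = np.zeros(c)  # hidden state resets at each group boundary
        for t in range(start, min(start + group_size, n)):
            h = decay * h + features[t]  # linear update, no nonlinearity on h
            out[t] = h
    return out

# Six voxels with two channels, split into two groups of three.
x = np.ones((6, 2))
y = linear_group_rnn(x, group_size=3, decay=0.5)
```

Each voxel is visited once, so the cost grows linearly with the number of voxels regardless of group size. That is the property that makes larger groups affordable compared with pairwise attention inside a window.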

Results

The same seven results are listed under each of these task tags: Object Detection, 3D, 3D Object Detection, 2D Classification, 2D Object Detection, 16k.

Dataset              Metric     Value  Model
nuScenes LiDAR only  NDS        73.9   LION
nuScenes LiDAR only  NDS (val)  72.1   LION
nuScenes LiDAR only  mAP        69.8   LION
nuScenes LiDAR only  mAP (val)  68     LION
ONCE                 mAP        66.6   LION
Argoverse2           mAP        41.5   LION
Waymo Open Dataset   mAPH/L2    74     LION

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
U-RWKV: Lightweight medical image segmentation with direction-adaptive RWKV (2025-07-15)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models (2025-07-14)