CASPNet++: Joint Multi-Agent Motion Prediction

Maximilian Schäfer, Kun Zhao, Anton Kummert

2023-08-15

motion prediction, Scene Understanding, Autonomous Driving, Prediction, Trajectory Prediction

Abstract

The prediction of road users' future motion is a critical task in supporting advanced driver-assistance systems (ADAS). It plays an even more crucial role for autonomous driving (AD) in enabling the planning and execution of safe driving maneuvers. Building on our previous work, the Context-Aware Scene Prediction Network (CASPNet), we propose an improved system, CASPNet++. In this work, we focus on further enhancing the interaction modeling and scene understanding to support the joint prediction of all road users in a scene, using spatiotemporal grids to model future occupancy. Moreover, an instance-based output head is introduced to provide multi-modal trajectories for agents of interest. In extensive quantitative and qualitative analyses, we demonstrate the scalability of CASPNet++ in utilizing and fusing diverse environmental input sources such as HD maps, radar detections, and lidar segmentation. Tested on the urban-focused prediction dataset nuScenes, CASPNet++ reaches state-of-the-art performance. The model has been deployed in a test vehicle, running in real time with moderate computational resources.

Results

Task	Dataset	Metric	Value	Model
Trajectory Prediction	nuScenes	MinADE_10	0.92	CASPNet++
Trajectory Prediction	nuScenes	MinADE_5	1.16	CASPNet++
Trajectory Prediction	nuScenes	MinFDE_1	6.18	CASPNet++
Trajectory Prediction	nuScenes	MissRateTopK_2_10	0.29	CASPNet++
Trajectory Prediction	nuScenes	MissRateTopK_2_5	0.5	CASPNet++
Trajectory Prediction	nuScenes	OffRoadRate	0.01	CASPNet++
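
The metric names in the table follow the nuScenes prediction benchmark: MinADE_k and MinFDE_k take the best of the k most likely predicted trajectories, and MissRateTopK_2_k counts an agent as a miss when none of its top-k trajectories stays within 2 m of the ground truth at every time step. The following is a minimal sketch of these definitions in plain Python, not the official evaluation code. The function names and the toy trajectories are hypothetical, and the predictions are assumed to be sorted by descending probability.

```python
import math

def displacements(pred, gt):
    """Pointwise L2 distances between one predicted and one ground-truth trajectory."""
    return [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]

def min_ade(preds, gt, k):
    """Smallest average displacement error among the first k predictions."""
    return min(sum(d) / len(d) for d in (displacements(p, gt) for p in preds[:k]))

def min_fde(preds, gt, k):
    """Smallest final-step displacement error among the first k predictions."""
    return min(displacements(p, gt)[-1] for p in preds[:k])

def is_miss(preds, gt, k, threshold=2.0):
    """True if every one of the first k predictions deviates by more than
    `threshold` metres from the ground truth at some time step."""
    return all(max(displacements(p, gt)) > threshold for p in preds[:k])

def miss_rate(samples, k, threshold=2.0):
    """Fraction of (preds, gt) samples that count as a miss."""
    return sum(is_miss(p, g, k, threshold) for p, g in samples) / len(samples)

# Toy example (made-up data): one agent, two predicted modes, three time steps.
gt = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
preds = [
    [(1.0, 3.0), (2.0, 3.0), (3.0, 3.0)],  # most likely mode, 3 m off throughout
    [(1.0, 0.0), (2.0, 0.0), (3.0, 1.0)],  # second mode, 1 m off at the last step
]
ade_1 = min_ade(preds, gt, k=1)
ade_2 = min_ade(preds, gt, k=2)
fde_2 = min_fde(preds, gt, k=2)
```

Under these definitions, a larger k can only lower MinADE, MinFDE, and the miss rate, which is consistent with the table reporting a lower MinADE_10 than MinADE_5 and a lower MissRateTopK_2_10 than MissRateTopK_2_5.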

Related Papers

Multi-Strategy Improved Snake Optimizer Accelerated CNN-LSTM-Attention-Adaboost for Trajectory Prediction (2025-07-21)
GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving (2025-07-19)
AGENTS-LLM: Augmentative GENeration of Challenging Traffic Scenarios with an Agentic LLM Framework (2025-07-18)
Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection (2025-07-17)
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models (2025-07-17)