Goodarz Mehr, Azim Eskandarian
Bird's-eye view (BEV) perception has garnered significant attention in autonomous driving in recent years, in part because BEV representation facilitates multi-modal sensor fusion. BEV representation enables a variety of perception tasks, including BEV segmentation, which provides a concise view of the environment useful for planning a vehicle's trajectory. However, this representation is not fully supported by existing datasets, and creating new datasets for this purpose can be time-consuming. To address this challenge, we introduce SimBEV, a randomized synthetic data generation tool that is extensively configurable and scalable, supports a wide array of sensors, incorporates information from multiple sources to capture accurate BEV ground truth, and enables a variety of perception tasks, including BEV segmentation and 3D object detection. SimBEV is used to create the SimBEV dataset, a large collection of annotated perception data from diverse driving scenarios. SimBEV and the SimBEV dataset are open and available to the public.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | SimBEV | bicycle | 0.036 | BEVFusion |
| Semantic Segmentation | SimBEV | bus | 0.808 | BEVFusion |
| Semantic Segmentation | SimBEV | car | 0.727 | BEVFusion |
| Semantic Segmentation | SimBEV | mIoU | 0.5 | BEVFusion |
| Semantic Segmentation | SimBEV | motorcycle | 0.363 | BEVFusion |
| Semantic Segmentation | SimBEV | pedestrian | 0.202 | BEVFusion |
| Semantic Segmentation | SimBEV | rider | 0.233 | BEVFusion |
| Semantic Segmentation | SimBEV | road | 0.884 | BEVFusion |
| Semantic Segmentation | SimBEV | truck | 0.745 | BEVFusion |
| Semantic Segmentation | SimBEV | bicycle | 0.114 | UniTR |
| Semantic Segmentation | SimBEV | bus | 0.517 | UniTR |
| Semantic Segmentation | SimBEV | car | 0.738 | UniTR |
| Semantic Segmentation | SimBEV | mIoU | 0.497 | UniTR |
| Semantic Segmentation | SimBEV | motorcycle | 0.365 | UniTR |
| Semantic Segmentation | SimBEV | pedestrian | 0.275 | UniTR |
| Semantic Segmentation | SimBEV | rider | 0.362 | UniTR |
| Semantic Segmentation | SimBEV | road | 0.928 | UniTR |
| Semantic Segmentation | SimBEV | truck | 0.677 | UniTR |
| Semantic Segmentation | SimBEV | bicycle | 0.036 | BEVFusion-L |
| Semantic Segmentation | SimBEV | bus | 0.815 | BEVFusion-L |
| Semantic Segmentation | SimBEV | car | 0.706 | BEVFusion-L |
| Semantic Segmentation | SimBEV | mIoU | 0.483 | BEVFusion-L |
| Semantic Segmentation | SimBEV | motorcycle | 0.325 | BEVFusion-L |
| Semantic Segmentation | SimBEV | pedestrian | 0.189 | BEVFusion-L |
| Semantic Segmentation | SimBEV | rider | 0.184 | BEVFusion-L |
| Semantic Segmentation | SimBEV | road | 0.877 | BEVFusion-L |
| Semantic Segmentation | SimBEV | truck | 0.735 | BEVFusion-L |
| Semantic Segmentation | SimBEV | bicycle | 0.063 | UniTR+LSS |
| Semantic Segmentation | SimBEV | bus | 0.585 | UniTR+LSS |
| Semantic Segmentation | SimBEV | car | 0.728 | UniTR+LSS |
| Semantic Segmentation | SimBEV | mIoU | 0.476 | UniTR+LSS |
| Semantic Segmentation | SimBEV | motorcycle | 0.359 | UniTR+LSS |
| Semantic Segmentation | SimBEV | pedestrian | 0.129 | UniTR+LSS |
| Semantic Segmentation | SimBEV | rider | 0.316 | UniTR+LSS |
| Semantic Segmentation | SimBEV | road | 0.933 | UniTR+LSS |
| Semantic Segmentation | SimBEV | truck | 0.694 | UniTR+LSS |
| Semantic Segmentation | SimBEV | bus | 0.229 | BEVFusion-C |
| Semantic Segmentation | SimBEV | car | 0.172 | BEVFusion-C |
| Semantic Segmentation | SimBEV | mIoU | 0.152 | BEVFusion-C |
| Semantic Segmentation | SimBEV | road | 0.76 | BEVFusion-C |
| Semantic Segmentation | SimBEV | truck | 0.051 | BEVFusion-C |
| Object Detection | SimBEV | SDS | 0.622 | UniTR+LSS |
| Object Detection | SimBEV | mAOE | 0.207 | UniTR+LSS |
| Object Detection | SimBEV | mAP | 0.478 | UniTR+LSS |
| Object Detection | SimBEV | mASE | 0.085 | UniTR+LSS |
| Object Detection | SimBEV | mATE | 0.113 | UniTR+LSS |
| Object Detection | SimBEV | mAVE | 0.53 | UniTR+LSS |
| Object Detection | SimBEV | SDS | 0.617 | UniTR |
| Object Detection | SimBEV | mAOE | 0.224 | UniTR |
| Object Detection | SimBEV | mAP | 0.477 | UniTR |
| Object Detection | SimBEV | mASE | 0.09 | UniTR |
| Object Detection | SimBEV | mATE | 0.113 | UniTR |
| Object Detection | SimBEV | mAVE | 0.55 | UniTR |
| Object Detection | SimBEV | SDS | 0.566 | BEVFusion |
| Object Detection | SimBEV | mAOE | 0.122 | BEVFusion |
| Object Detection | SimBEV | mAP | 0.481 | BEVFusion |
| Object Detection | SimBEV | mASE | 0.127 | BEVFusion |
| Object Detection | SimBEV | mATE | 0.146 | BEVFusion |
| Object Detection | SimBEV | mAVE | 1.54 | BEVFusion |
| Object Detection | SimBEV | SDS | 0.564 | BEVFusion-L |
| Object Detection | SimBEV | mAOE | 0.133 | BEVFusion-L |
| Object Detection | SimBEV | mAP | 0.481 | BEVFusion-L |
| Object Detection | SimBEV | mASE | 0.134 | BEVFusion-L |
| Object Detection | SimBEV | mATE | 0.144 | BEVFusion-L |
| Object Detection | SimBEV | mAVE | 1.56 | BEVFusion-L |
| Object Detection | SimBEV | SDS | 0.251 | BEVFusion-C |
| Object Detection | SimBEV | mAOE | 1.044 | BEVFusion-C |
| Object Detection | SimBEV | mAP | 0.221 | BEVFusion-C |
| Object Detection | SimBEV | mASE | 0.137 | BEVFusion-C |
| Object Detection | SimBEV | mATE | 0.744 | BEVFusion-C |
| Object Detection | SimBEV | mAVE | 4.65 | BEVFusion-C |
| 3D Object Detection | SimBEV | SDS | 0.622 | UniTR+LSS |
| 3D Object Detection | SimBEV | mAOE | 0.207 | UniTR+LSS |
| 3D Object Detection | SimBEV | mAP | 0.478 | UniTR+LSS |
| 3D Object Detection | SimBEV | mASE | 0.085 | UniTR+LSS |
| 3D Object Detection | SimBEV | mATE | 0.113 | UniTR+LSS |
| 3D Object Detection | SimBEV | mAVE | 0.53 | UniTR+LSS |
| 3D Object Detection | SimBEV | SDS | 0.617 | UniTR |
| 3D Object Detection | SimBEV | mAOE | 0.224 | UniTR |
| 3D Object Detection | SimBEV | mAP | 0.477 | UniTR |
| 3D Object Detection | SimBEV | mASE | 0.09 | UniTR |
| 3D Object Detection | SimBEV | mATE | 0.113 | UniTR |
| 3D Object Detection | SimBEV | mAVE | 0.55 | UniTR |
| 3D Object Detection | SimBEV | SDS | 0.566 | BEVFusion |
| 3D Object Detection | SimBEV | mAOE | 0.122 | BEVFusion |
| 3D Object Detection | SimBEV | mAP | 0.481 | BEVFusion |
| 3D Object Detection | SimBEV | mASE | 0.127 | BEVFusion |
| 3D Object Detection | SimBEV | mATE | 0.146 | BEVFusion |
| 3D Object Detection | SimBEV | mAVE | 1.54 | BEVFusion |
| 3D Object Detection | SimBEV | SDS | 0.564 | BEVFusion-L |
| 3D Object Detection | SimBEV | mAOE | 0.133 | BEVFusion-L |
| 3D Object Detection | SimBEV | mAP | 0.481 | BEVFusion-L |
| 3D Object Detection | SimBEV | mASE | 0.134 | BEVFusion-L |
| 3D Object Detection | SimBEV | mATE | 0.144 | BEVFusion-L |
| 3D Object Detection | SimBEV | mAVE | 1.56 | BEVFusion-L |
| 3D Object Detection | SimBEV | SDS | 0.251 | BEVFusion-C |
| 3D Object Detection | SimBEV | mAOE | 1.044 | BEVFusion-C |
| 3D Object Detection | SimBEV | mAP | 0.221 | BEVFusion-C |
| 3D Object Detection | SimBEV | mASE | 0.137 | BEVFusion-C |
| 3D Object Detection | SimBEV | mATE | 0.744 | BEVFusion-C |
| 3D Object Detection | SimBEV | mAVE | 4.65 | BEVFusion-C |
| Bird's-Eye View Semantic Segmentation | SimBEV | bicycle | 0.036 | BEVFusion |
| Bird's-Eye View Semantic Segmentation | SimBEV | bus | 0.808 | BEVFusion |
| Bird's-Eye View Semantic Segmentation | SimBEV | car | 0.727 | BEVFusion |
| Bird's-Eye View Semantic Segmentation | SimBEV | mIoU | 0.5 | BEVFusion |
| Bird's-Eye View Semantic Segmentation | SimBEV | motorcycle | 0.363 | BEVFusion |
| Bird's-Eye View Semantic Segmentation | SimBEV | pedestrian | 0.202 | BEVFusion |
| Bird's-Eye View Semantic Segmentation | SimBEV | rider | 0.233 | BEVFusion |
| Bird's-Eye View Semantic Segmentation | SimBEV | road | 0.884 | BEVFusion |
| Bird's-Eye View Semantic Segmentation | SimBEV | truck | 0.745 | BEVFusion |
| Bird's-Eye View Semantic Segmentation | SimBEV | bicycle | 0.114 | UniTR |
| Bird's-Eye View Semantic Segmentation | SimBEV | bus | 0.517 | UniTR |
| Bird's-Eye View Semantic Segmentation | SimBEV | car | 0.738 | UniTR |
| Bird's-Eye View Semantic Segmentation | SimBEV | mIoU | 0.497 | UniTR |
| Bird's-Eye View Semantic Segmentation | SimBEV | motorcycle | 0.365 | UniTR |
| Bird's-Eye View Semantic Segmentation | SimBEV | pedestrian | 0.275 | UniTR |
| Bird's-Eye View Semantic Segmentation | SimBEV | rider | 0.362 | UniTR |
| Bird's-Eye View Semantic Segmentation | SimBEV | road | 0.928 | UniTR |
| Bird's-Eye View Semantic Segmentation | SimBEV | truck | 0.677 | UniTR |
| Bird's-Eye View Semantic Segmentation | SimBEV | bicycle | 0.036 | BEVFusion-L |
| Bird's-Eye View Semantic Segmentation | SimBEV | bus | 0.815 | BEVFusion-L |
| Bird's-Eye View Semantic Segmentation | SimBEV | car | 0.706 | BEVFusion-L |
| Bird's-Eye View Semantic Segmentation | SimBEV | mIoU | 0.483 | BEVFusion-L |
| Bird's-Eye View Semantic Segmentation | SimBEV | motorcycle | 0.325 | BEVFusion-L |
| Bird's-Eye View Semantic Segmentation | SimBEV | pedestrian | 0.189 | BEVFusion-L |
| Bird's-Eye View Semantic Segmentation | SimBEV | rider | 0.184 | BEVFusion-L |
| Bird's-Eye View Semantic Segmentation | SimBEV | road | 0.877 | BEVFusion-L |
| Bird's-Eye View Semantic Segmentation | SimBEV | truck | 0.735 | BEVFusion-L |
| Bird's-Eye View Semantic Segmentation | SimBEV | bicycle | 0.063 | UniTR+LSS |
| Bird's-Eye View Semantic Segmentation | SimBEV | bus | 0.585 | UniTR+LSS |
| Bird's-Eye View Semantic Segmentation | SimBEV | car | 0.728 | UniTR+LSS |
| Bird's-Eye View Semantic Segmentation | SimBEV | mIoU | 0.476 | UniTR+LSS |
| Bird's-Eye View Semantic Segmentation | SimBEV | motorcycle | 0.359 | UniTR+LSS |
| Bird's-Eye View Semantic Segmentation | SimBEV | pedestrian | 0.129 | UniTR+LSS |
| Bird's-Eye View Semantic Segmentation | SimBEV | rider | 0.316 | UniTR+LSS |
| Bird's-Eye View Semantic Segmentation | SimBEV | road | 0.933 | UniTR+LSS |
| Bird's-Eye View Semantic Segmentation | SimBEV | truck | 0.694 | UniTR+LSS |
| Bird's-Eye View Semantic Segmentation | SimBEV | bus | 0.229 | BEVFusion-C |
| Bird's-Eye View Semantic Segmentation | SimBEV | car | 0.172 | BEVFusion-C |
| Bird's-Eye View Semantic Segmentation | SimBEV | mIoU | 0.152 | BEVFusion-C |
| Bird's-Eye View Semantic Segmentation | SimBEV | road | 0.76 | BEVFusion-C |
| Bird's-Eye View Semantic Segmentation | SimBEV | truck | 0.051 | BEVFusion-C |
| BEV Segmentation | SimBEV | bicycle | 0.036 | BEVFusion |
| BEV Segmentation | SimBEV | bus | 0.808 | BEVFusion |
| BEV Segmentation | SimBEV | car | 0.727 | BEVFusion |
| BEV Segmentation | SimBEV | mIoU | 0.5 | BEVFusion |
| BEV Segmentation | SimBEV | motorcycle | 0.363 | BEVFusion |
| BEV Segmentation | SimBEV | pedestrian | 0.202 | BEVFusion |
| BEV Segmentation | SimBEV | rider | 0.233 | BEVFusion |
| BEV Segmentation | SimBEV | road | 0.884 | BEVFusion |
| BEV Segmentation | SimBEV | truck | 0.745 | BEVFusion |
| BEV Segmentation | SimBEV | bicycle | 0.114 | UniTR |
| BEV Segmentation | SimBEV | bus | 0.517 | UniTR |
| BEV Segmentation | SimBEV | car | 0.738 | UniTR |
| BEV Segmentation | SimBEV | mIoU | 0.497 | UniTR |
| BEV Segmentation | SimBEV | motorcycle | 0.365 | UniTR |
| BEV Segmentation | SimBEV | pedestrian | 0.275 | UniTR |
| BEV Segmentation | SimBEV | rider | 0.362 | UniTR |
| BEV Segmentation | SimBEV | road | 0.928 | UniTR |
| BEV Segmentation | SimBEV | truck | 0.677 | UniTR |
| BEV Segmentation | SimBEV | bicycle | 0.036 | BEVFusion-L |
| BEV Segmentation | SimBEV | bus | 0.815 | BEVFusion-L |
| BEV Segmentation | SimBEV | car | 0.706 | BEVFusion-L |
| BEV Segmentation | SimBEV | mIoU | 0.483 | BEVFusion-L |
| BEV Segmentation | SimBEV | motorcycle | 0.325 | BEVFusion-L |
| BEV Segmentation | SimBEV | pedestrian | 0.189 | BEVFusion-L |
| BEV Segmentation | SimBEV | rider | 0.184 | BEVFusion-L |
| BEV Segmentation | SimBEV | road | 0.877 | BEVFusion-L |
| BEV Segmentation | SimBEV | truck | 0.735 | BEVFusion-L |
| BEV Segmentation | SimBEV | bicycle | 0.063 | UniTR+LSS |
| BEV Segmentation | SimBEV | bus | 0.585 | UniTR+LSS |
| BEV Segmentation | SimBEV | car | 0.728 | UniTR+LSS |
| BEV Segmentation | SimBEV | mIoU | 0.476 | UniTR+LSS |
| BEV Segmentation | SimBEV | motorcycle | 0.359 | UniTR+LSS |
| BEV Segmentation | SimBEV | pedestrian | 0.129 | UniTR+LSS |
| BEV Segmentation | SimBEV | rider | 0.316 | UniTR+LSS |
| BEV Segmentation | SimBEV | road | 0.933 | UniTR+LSS |
| BEV Segmentation | SimBEV | truck | 0.694 | UniTR+LSS |
| BEV Segmentation | SimBEV | bus | 0.229 | BEVFusion-C |
| BEV Segmentation | SimBEV | car | 0.172 | BEVFusion-C |
| BEV Segmentation | SimBEV | mIoU | 0.152 | BEVFusion-C |
| BEV Segmentation | SimBEV | road | 0.76 | BEVFusion-C |
| BEV Segmentation | SimBEV | truck | 0.051 | BEVFusion-C |
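
The aggregate rows in the table can be sanity-checked against the per-class and per-metric rows. The sketch below is illustrative only and rests on two assumptions that this page does not state: that mIoU is the unweighted mean over the eight listed classes, with classes missing from a model's rows (as for BEVFusion-C) counted as 0, and that SDS follows a nuScenes-NDS-style form in which mAP and the clipped error terms (mATE, mAOE, mASE, mAVE) carry equal total weight. The names `CLASSES`, `mean_iou`, and `detection_score` are hypothetical helpers, not part of SimBEV. Under these assumptions the computed values agree with the listed mIoU and SDS to within about 0.001.

```python
# Values copied from the Semantic Segmentation and Object Detection rows above
# (two models each, for brevity).
# Assumption: these eight classes are the full class set, and a class with no
# row for a model counts as IoU 0.
CLASSES = ("road", "car", "truck", "bus",
           "motorcycle", "bicycle", "rider", "pedestrian")

IOU = {
    "BEVFusion": {"road": 0.884, "car": 0.727, "truck": 0.745, "bus": 0.808,
                  "motorcycle": 0.363, "bicycle": 0.036, "rider": 0.233,
                  "pedestrian": 0.202},
    "BEVFusion-C": {"road": 0.76, "car": 0.172, "truck": 0.051, "bus": 0.229},
}

# Per model: (mAP, [mATE, mAOE, mASE, mAVE]).
DETECTION = {
    "UniTR+LSS": (0.478, [0.113, 0.207, 0.085, 0.53]),
    "BEVFusion": (0.481, [0.146, 0.122, 0.127, 1.54]),
}


def mean_iou(per_class, classes=CLASSES):
    """Unweighted mean IoU over `classes`; missing classes count as 0."""
    return sum(per_class.get(name, 0.0) for name in classes) / len(classes)


def detection_score(m_ap, tp_errors):
    """Assumed NDS-style score.

    Each error is clipped to [0, 1] and turned into a score (1 - error);
    mAP is weighted so that it counts as much as all error terms together.
    """
    n = len(tp_errors)
    clipped = sum(1.0 - min(1.0, err) for err in tp_errors)
    return (n * m_ap + clipped) / (2 * n)


if __name__ == "__main__":
    for model, rows in IOU.items():
        print(f"{model}: mIoU ~ {mean_iou(rows):.3f}")
    for model, (m_ap, errors) in DETECTION.items():
        print(f"{model}: SDS ~ {detection_score(m_ap, errors):.3f}")
```

One consequence of the clipping, if the assumed form holds, is that an error above 1 (such as the mAVE of 1.54 for BEVFusion) contributes nothing to the score rather than a negative amount.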