Qi Chen
Autonomous vehicles rely heavily on their sensors to perceive the surrounding environment; however, with the current state of technology, the data a vehicle can use is confined to what its own sensors capture. Data sharing between vehicles and/or edge servers is limited by the available network bandwidth and by the stringent real-time constraints of autonomous driving applications. To address these issues, we propose F-Cooper, a point-cloud-feature-based cooperative perception framework that lets connected autonomous vehicles achieve better object detection precision. Not only are feature-level data sufficient for the training process, but their intrinsically small size also enables real-time edge computing without the risk of congesting the network. Our experimental results show that by fusing features we achieve better object detection, around a 10% improvement for detection within 20 meters and 30% at longer distances, as well as faster edge computing with low communication delay, requiring as little as 71 milliseconds for certain feature selections. To the best of our knowledge, we are the first to introduce feature-level data fusion to connected autonomous vehicles for the purpose of enhancing object detection and making real-time edge computing on inter-vehicle data feasible for autonomous vehicles.
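To make the fusion idea concrete, the sketch below fuses two vehicles' intermediate feature maps with an element-wise maxout, the operator F-Cooper applies to voxel and spatial features. This is a minimal illustration, not the paper's implementation: the `fuse_maxout` name, the tensor shapes, and the assumption that both maps are already aligned to a common coordinate frame are all illustrative.

```python
import torch

def fuse_maxout(feat_ego: torch.Tensor, feat_other: torch.Tensor) -> torch.Tensor:
    """Element-wise maxout fusion of two aligned feature maps.

    Both tensors are assumed to be (C, H, W) bird's-eye-view feature
    maps already projected into a common coordinate frame, so that
    cell (h, w) in one map covers the same ground area as in the other.
    """
    assert feat_ego.shape == feat_other.shape, "feature maps must be aligned"
    # Keep, per channel and per cell, the stronger of the two responses.
    return torch.max(feat_ego, feat_other)

# Toy usage: two 64-channel feature maps (hypothetical sizes).
ego   = torch.randn(64, 200, 176)   # features from the ego vehicle's point cloud
other = torch.randn(64, 200, 176)   # features received from a nearby vehicle
fused = fuse_maxout(ego, other)     # fed to the detection head in place of `ego`
```

Because only the compact feature maps cross the network rather than raw point clouds, the transmitted payload stays small, which is what makes the reported real-time edge-computing latency plausible.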
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 3D Object Detection | OPV2V | AP@0.7 (Culver City) | 0.728 | F-Cooper (PointPillar backbone) |
| 3D Object Detection | OPV2V | AP@0.7 (Default) | 0.79 | F-Cooper (PointPillar backbone) |
| 3D Object Detection | V2XSet | AP@0.5 (Noisy) | 0.715 | F-Cooper |
| 3D Object Detection | V2XSet | AP@0.5 (Perfect) | 0.84 | F-Cooper |
| 3D Object Detection | V2XSet | AP@0.7 (Noisy) | 0.469 | F-Cooper |
| 3D Object Detection | V2XSet | AP@0.7 (Perfect) | 0.68 | F-Cooper |
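In the metric column, AP@0.5 and AP@0.7 denote average precision with an IoU match threshold of 0.5 or 0.7. The sketch below shows the matching rule for axis-aligned 2D boxes only; the benchmarks above match rotated 3D boxes, so this is an illustrative simplification, and the function names are assumptions.

```python
import numpy as np

def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area - inter)

def average_precision(preds, gts, iou_thresh=0.7):
    """preds: list of (score, box); gts: list of boxes.

    Greedily matches predictions (highest score first) to unmatched
    ground truths, counting a true positive when IoU >= iou_thresh,
    then integrates precision over recall.
    """
    preds = sorted(preds, key=lambda p: -p[0])
    matched = set()
    tp = np.zeros(len(preds))
    for i, (_, box) in enumerate(preds):
        best, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            ov = iou_2d(box, gt)
            if ov > best:
                best, best_j = ov, j
        if best >= iou_thresh:
            tp[i] = 1.0
            matched.add(best_j)
    recall = np.cumsum(tp) / max(len(gts), 1)
    precision = np.cumsum(tp) / np.arange(1, len(preds) + 1)
    # Trapezoidal integration of the precision-recall curve.
    return float(np.trapz(precision, recall))
```

Raising the threshold from 0.5 to 0.7 demands much tighter box localization, which is why the AP@0.7 values in the table are consistently lower than their AP@0.5 counterparts.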