Damien Robert, Hugo Raguet, Loic Landrieu
We introduce a highly efficient method for panoptic segmentation of large 3D point clouds by redefining this task as a scalable graph clustering problem. This approach can be trained using only local auxiliary tasks, thereby eliminating the resource-intensive instance-matching step during training. Moreover, our formulation can easily be adapted to the superpoint paradigm, further increasing its efficiency. This allows our model to process scenes with millions of points and thousands of objects in a single inference. Our method, called SuperCluster, achieves new state-of-the-art panoptic segmentation performance on two indoor scanning datasets: $50.1$ PQ ($+7.8$) for S3DIS Area 5, and $58.7$ PQ ($+25.2$) for ScanNetV2. We also set the first state-of-the-art for two large-scale mobile mapping benchmarks: KITTI-360 and DALES. With only $209$k parameters, our model is over $30$ times smaller than the best competing method and trains up to $15$ times faster. Our code and pretrained models are available at https://github.com/drprojects/superpoint_transformer.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | S3DIS Area5 | Params (M) | 0.21 | SuperCluster |
| Semantic Segmentation | S3DIS Area5 | mIoU | 68.1 | SuperCluster |
| Semantic Segmentation | S3DIS | mIoU | 75.3 | SuperCluster |
| Semantic Segmentation | S3DIS | Params (M) | 0.21 | SuperCluster |
| Semantic Segmentation | DALES | mIoU | 77.3 | SuperCluster |
| Semantic Segmentation | KITTI-360 | mIoU (val) | 62.1 | SuperCluster |
| Panoptic Segmentation | ScanNet | PQ | 58.7 | SuperCluster |
| Panoptic Segmentation | ScanNet | PQ_st | 84.1 | SuperCluster |
| Panoptic Segmentation | ScanNet | PQ_th | 69.1 | SuperCluster |
| Panoptic Segmentation | ScanNetV2 | PQ | 58.7 | SuperCluster |
| Panoptic Segmentation | ScanNetV2 | Params (M) | 1 | SuperCluster |
| Panoptic Segmentation | ScanNetV2 | RQ | 69.1 | SuperCluster |
| Panoptic Segmentation | ScanNetV2 | SQ | 84.1 | SuperCluster |
| Panoptic Segmentation | KITTI-360 | PQ | 48.3 | SuperCluster |
| Panoptic Segmentation | KITTI-360 | Params (M) | 0.79 | SuperCluster |
| Panoptic Segmentation | KITTI-360 | RQ | 58.4 | SuperCluster |
| Panoptic Segmentation | KITTI-360 | SQ | 75.1 | SuperCluster |
| Panoptic Segmentation | DALES | PQ | 61.2 | SuperCluster |
| Panoptic Segmentation | DALES | Params (M) | 0.21 | SuperCluster |
| Panoptic Segmentation | DALES | RQ | 68.6 | SuperCluster |
| Panoptic Segmentation | DALES | SQ | 87.1 | SuperCluster |
| Panoptic Segmentation | S3DIS Area5 | PQ | 50.1 | SuperCluster |
| Panoptic Segmentation | S3DIS Area5 | PQ (with stuff) | 58.4 | SuperCluster |
| Panoptic Segmentation | S3DIS Area5 | Params (M) | 0.21 | SuperCluster |
| Panoptic Segmentation | S3DIS Area5 | RQ | 60.1 | SuperCluster |
| Panoptic Segmentation | S3DIS Area5 | RQ (with stuff) | 68.4 | SuperCluster |
| Panoptic Segmentation | S3DIS Area5 | SQ | 76.6 | SuperCluster |
| Panoptic Segmentation | S3DIS Area5 | SQ (with stuff) | 77.8 | SuperCluster |
| Panoptic Segmentation | S3DIS | PQ | 55.9 | SuperCluster |
| Panoptic Segmentation | S3DIS | PQ (with stuff) | 62.7 | SuperCluster |
| Panoptic Segmentation | S3DIS | Params (M) | 0.21 | SuperCluster |
| Panoptic Segmentation | S3DIS | RQ | 66.3 | SuperCluster |
| Panoptic Segmentation | S3DIS | RQ (with stuff) | 73.2 | SuperCluster |
| Panoptic Segmentation | S3DIS | SQ | 83.8 | SuperCluster |
| Panoptic Segmentation | S3DIS | SQ (with stuff) | 84.8 | SuperCluster |
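
The PQ, SQ, and RQ columns above follow the standard panoptic quality definition. As a reading aid, here is a minimal sketch of how these three numbers relate for a single class. It is not the authors' evaluation code: the names `PanopticStats`, `accumulate`, and `panoptic_quality` are hypothetical, and the sketch assumes the IoU of every overlapping (prediction, ground truth) pair of that class has already been computed.

```python
from dataclasses import dataclass


@dataclass
class PanopticStats:
    """Per-class counts needed for PQ (hypothetical helper, for illustration)."""
    iou_sum: float = 0.0  # sum of IoUs over matched (true-positive) pairs
    tp: int = 0           # matched prediction / ground-truth pairs
    fp: int = 0           # predicted segments left unmatched
    fn: int = 0           # ground-truth segments left unmatched


def accumulate(pair_ious, num_pred, num_gt):
    """Build the counts for one class from precomputed pair IoUs.

    A pair counts as a match when its IoU exceeds 0.5; with non-overlapping
    segments this threshold makes each match unique.
    """
    matched = [iou for iou in pair_ious if iou > 0.5]
    tp = len(matched)
    return PanopticStats(sum(matched), tp, num_pred - tp, num_gt - tp)


def panoptic_quality(stats):
    """Return (PQ, SQ, RQ) for one class."""
    denom = stats.tp + 0.5 * stats.fp + 0.5 * stats.fn
    if denom == 0:
        return 0.0, 0.0, 0.0
    sq = stats.iou_sum / stats.tp if stats.tp else 0.0  # mean IoU of matches
    rq = stats.tp / denom                               # F1-style recognition score
    return sq * rq, sq, rq


# Toy example: 3 predicted and 4 ground-truth segments, three overlapping pairs.
stats = accumulate([0.8, 0.6, 0.3], num_pred=3, num_gt=4)
pq, sq, rq = panoptic_quality(stats)
```

For a single class, PQ equals SQ times RQ. Benchmark figures are usually averaged over classes, so the product of a reported SQ and RQ need not match the reported PQ exactly.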