Duc Dang Trung Tran, Byeongkeun Kang, Yeejin Lee
Recently, transformer-based techniques incorporating superpoints have become prevalent in 3D instance segmentation. However, these methods often over-segment, particularly on large objects. Unreliable masks predicted at the superpoint level further compound this issue. To address these challenges, we propose a novel framework called MSTA3D. It leverages a multi-scale feature representation and introduces a twin-attention mechanism to capture these multi-scale features effectively. Furthermore, MSTA3D integrates a box query with a box regularizer, offering a complementary spatial constraint alongside semantic queries. Experimental evaluations on the ScanNetV2, ScanNet200, and S3DIS datasets demonstrate that our approach surpasses state-of-the-art 3D instance segmentation methods.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 3D Instance Segmentation | S3DIS | AP@50 | 70 | MSTA3D |
| 3D Instance Segmentation | S3DIS | mPrec | 80.6 | MSTA3D |
| 3D Instance Segmentation | S3DIS | mRec | 70.1 | MSTA3D |
| 3D Instance Segmentation | ScanNet(v2) | mAP | 56.9 | MSTA3D |
| 3D Instance Segmentation | ScanNet(v2) | mAP@50 | 79.5 | MSTA3D |
| 3D Instance Segmentation | ScanNet(v2) | mAP@25 | 87.9 | MSTA3D |
| 3D Instance Segmentation | ScanNet(v2) | mRec | 74.1 | MSTA3D |
| 3D Instance Segmentation | ScanNet200 | mAP | 26.2 | MSTA3D |
| 3D Instance Segmentation | ScanNet200 | mAP@25 | 40.1 | MSTA3D |
| 3D Instance Segmentation | ScanNet200 | mAP@50 | 35.2 | MSTA3D |
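
The @25 and @50 suffixes in the table refer to the IoU threshold at which a predicted instance mask counts as a match for a ground-truth instance. The following is a minimal sketch of that matching test, not the official evaluation code: the function names are hypothetical, masks are assumed to be sets of point indices, and the strict `>` comparison is an assumption about boundary handling.

```python
def mask_iou(pred_points, gt_points):
    """IoU between two instance masks given as collections of point indices."""
    pred, gt = set(pred_points), set(gt_points)
    union = len(pred | gt)
    # Two empty masks share nothing; define their IoU as 0.
    return len(pred & gt) / union if union else 0.0


def is_match(pred_points, gt_points, iou_threshold):
    """True if the prediction overlaps the ground truth above the threshold.

    Assumption: a strict comparison is used at the boundary.
    """
    return mask_iou(pred_points, gt_points) > iou_threshold


# Toy example: 4 shared points out of 10 in the union.
pred = {0, 1, 2, 3, 4, 5}
gt = {2, 3, 4, 5, 6, 7, 8, 9}

iou = mask_iou(pred, gt)
match_at_25 = is_match(pred, gt, 0.25)  # counts toward an @25 metric
match_at_50 = is_match(pred, gt, 0.50)  # does not count toward an @50 metric
```

In this toy case the prediction covers only part of a larger object, so it passes the looser threshold but fails the stricter one, which is the kind of gap between @25 and @50 scores that over-segmentation tends to produce.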