Chenglizhao Chen, Jipeng Wei, Chong Peng, Hong Qin
Existing fusion-based RGB-D salient object detection methods usually adopt a bi-stream structure to strike the fusion trade-off between RGB and depth (D). Depth quality usually varies from scene to scene, yet the SOTA bi-stream approaches are unaware of depth quality. This makes it difficult to reach a complementary fusion status between RGB and D and leads to poor fusion results when the depth is of low quality. This paper therefore attempts to integrate a novel depth-quality-aware subnet into the classic bi-stream structure, aiming to assess depth quality before conducting selective RGB-D fusion. Compared with the SOTA bi-stream methods, the major highlight of our method is its ability to lessen the importance of low-quality, no-contribution, or even negative-contribution D regions during RGB-D fusion, achieving a much improved complementary status between RGB and D.
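To make the idea of selective fusion concrete, the following is a minimal sketch and not the authors' network. It assumes that each region already has a scalar RGB feature, a scalar depth feature, and a depth quality score in [0, 1]; in the paper, that score would come from the learned depth-quality-aware subnet. The function name `quality_weighted_fusion`, its parameters, and the additive fusion rule are all hypothetical choices made for illustration.

```python
def quality_weighted_fusion(rgb_feat, depth_feat, depth_quality):
    """Fuse per-region RGB and depth features (illustrative only).

    Each depth feature is scaled by its quality score before being added
    to the RGB feature, so low-quality depth regions contribute less.
    The additive rule is an assumption, not the paper's fusion module.
    """
    if not (len(rgb_feat) == len(depth_feat) == len(depth_quality)):
        raise ValueError("all inputs must have the same length")

    fused = []
    for r, d, q in zip(rgb_feat, depth_feat, depth_quality):
        q = min(1.0, max(0.0, q))  # clamp the quality score to [0, 1]
        fused.append(r + q * d)    # a score of 0 ignores depth entirely
    return fused


# Three regions: trusted depth, untrusted depth, partially trusted depth.
rgb = [0.5, 0.25, 0.5]
depth = [0.5, 0.75, 0.5]
quality = [1.0, 0.0, 0.5]
fused = quality_weighted_fusion(rgb, depth, quality)
print(fused)  # [1.0, 0.25, 0.75]
```

In the second region the quality score is 0, so the fused value equals the RGB feature alone. This mirrors the stated goal of suppressing no-contribution or negative-contribution depth regions.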
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Detection | NJU2K | Average MAE | 0.052 | DQSD-VGG19 |
| Object Detection | NJU2K | S-Measure | 89.7 | DQSD-VGG19 |