Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks

Deng-Ping Fan, Zheng Lin, Jia-Xing Zhao, Yun Liu, Zhao Zhang, Qibin Hou, Menglong Zhu, Ming-Ming Cheng

2019-07-15 | Tasks: Salient Object Detection, RGB-D Salient Object Detection, Object Detection, RGB Salient Object Detection

Abstract

The use of RGB-D information for salient object detection has been extensively explored in recent years. However, relatively little effort has gone into modeling salient object detection in real-world human activity scenes with RGB-D. In this work, we fill the gap by making the following contributions to RGB-D salient object detection. (1) We carefully collect a new SIP (salient person) dataset, which consists of ~1K high-resolution images that cover diverse real-world scenes with varied viewpoints, poses, occlusions, illuminations, and backgrounds. (2) We conduct a large-scale (and, so far, the most comprehensive) benchmark comparing contemporary methods, which has long been missing in the field and can serve as a baseline for future research. We systematically summarize 32 popular models and evaluate 18 of them on seven datasets containing a total of about 97K images. (3) We propose a simple general architecture, called Deep Depth-Depurator Network (D3Net). It consists of a depth depurator unit (DDU) and a three-stream feature learning module (FLM), which perform low-quality depth map filtering and cross-modal feature learning, respectively. These components form a nested structure and are designed to be learned jointly. D3Net exceeds the performance of all prior contenders across all five metrics under consideration, thus serving as a strong model to advance research in this field. We also demonstrate that D3Net can be used to efficiently extract salient object masks from real scenes, enabling an effective background-changing application that runs at 65 fps on a single GPU. All the saliency maps, our new SIP dataset, the D3Net model, and the evaluation tools are publicly available at https://github.com/DengPingFan/D3NetBenchmark.
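The background-changing application mentioned in the abstract can be illustrated with a minimal compositing sketch. This is not the authors' code: the function name `replace_background` and the flat-list image layout are assumptions made for illustration, and the saliency mask is assumed to be already predicted (by D3Net or any other model) with values in [0, 1].

```python
def replace_background(foreground, background, mask):
    """Blend two images using a saliency mask (hypothetical helper).

    foreground, background: equal-length lists of (r, g, b) tuples, 0..255.
    mask: list of floats in [0, 1]; 1 keeps the foreground pixel,
          0 takes the pixel from the new background.
    """
    if not (len(foreground) == len(background) == len(mask)):
        raise ValueError("images and mask must have the same number of pixels")

    blended = []
    for fg_pixel, bg_pixel, m in zip(foreground, background, mask):
        # Per-channel linear blend: m * foreground + (1 - m) * background.
        blended.append(tuple(
            round(m * f + (1.0 - m) * b) for f, b in zip(fg_pixel, bg_pixel)
        ))
    return blended


# Two red pixels composited over a blue background:
# the first is marked salient, the second is not.
result = replace_background(
    foreground=[(255, 0, 0), (255, 0, 0)],
    background=[(0, 0, 255), (0, 0, 255)],
    mask=[1.0, 0.0],
)
```

A soft mask (values strictly between 0 and 1) gives smoother object boundaries than a hard threshold; a real pipeline would more likely do this blend with an array library than with Python lists.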

Results

Model: D3Net

Dataset | Average MAE | S-Measure | max E-Measure | max F-Measure
NJU2K   | 0.046       | 90        | 93.9          | 90
STERE   | 0.046       | 89.9      | 93.8          | 89.1
LFSD    | 0.095       | 82.5      | 86.2          | 81
SIP     | 0.063       | 86        | 90.9          | 86.1
RGBD135 | 0.058       | 85.7      | 91            | 83.4
NLPR    | 0.03        | 91.2      | 95.3          | 89.7

The same values are listed under each of the task labels Object Detection, 3D, 2D Classification, 2D Object Detection, and 16k.
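For readers unfamiliar with the metrics in the results, here is a minimal sketch of how two of them, MAE and max F-measure, are commonly computed for a single saliency map. It is not the benchmark's evaluation tool: the function names are hypothetical, `beta_sq=0.3` is the weighting conventionally used in salient object detection rather than something stated on this page, and the official protocol (for example, how scores are aggregated over a dataset) may differ. The F-measure values listed above appear to be scaled to percentages.

```python
def mean_absolute_error(pred, gt):
    """Average absolute difference between a saliency map and its ground truth.

    pred, gt: equal-length flat sequences of floats in [0, 1].
    Lower is better.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def max_f_measure(pred, gt, beta_sq=0.3, num_thresholds=256):
    """Largest F-measure over evenly spaced binarization thresholds.

    beta_sq=0.3 is an assumed, conventional value that weights
    precision more heavily than recall.
    """
    positives = sum(1 for g in gt if g > 0.5)
    best = 0.0
    for i in range(num_thresholds):
        threshold = i / (num_thresholds - 1)
        predicted = sum(1 for p in pred if p >= threshold)
        true_pos = sum(1 for p, g in zip(pred, gt) if p >= threshold and g > 0.5)
        if true_pos == 0:
            continue  # precision and recall are both zero (or undefined)
        precision = true_pos / predicted
        recall = true_pos / positives
        f = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
        best = max(best, f)
    return best


# Toy 4-pixel example: the prediction ranks both salient pixels highest.
pred = [0.9, 0.8, 0.2, 0.1]
gt = [1.0, 1.0, 0.0, 0.0]
mae = mean_absolute_error(pred, gt)
max_f = max_f_measure(pred, gt)
```

In this toy case the MAE is 0.15 and the max F-measure is 1.0, because some threshold separates the two salient pixels from the rest perfectly even though the predicted values are not exactly 0 or 1.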
