Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Specificity-preserving RGB-D Saliency Detection

Tao Zhou, Deng-Ping Fan, Geng Chen, Yi Zhou, Huazhu Fu

2021-08-18 · ICCV 2021

Tasks: Thermal Image Segmentation, Saliency Prediction, Salient Object Detection, Object Detection, Saliency Detection

Paper · PDF · Code (official)

Abstract

Salient object detection (SOD) on RGB and depth images has attracted increasing research interest, due to its effectiveness and the fact that depth cues can now be conveniently captured. Existing RGB-D SOD models usually adopt different fusion strategies to learn a shared representation from the two modalities (i.e., RGB and depth), while few methods explicitly consider how to preserve modality-specific characteristics. In this study, we propose a novel framework, termed SPNet (Specificity-preserving network), which improves SOD performance by exploring both the shared information and the modality-specific properties (i.e., specificity). Specifically, we adopt two modality-specific networks and a shared learning network to generate individual and shared saliency prediction maps, respectively. To effectively fuse cross-modal features in the shared learning network, we propose a cross-enhanced integration module (CIM) and then propagate the fused feature to the next layer to integrate cross-level information. Moreover, to capture rich complementary multi-modal information and further boost SOD performance, we propose a multi-modal feature aggregation (MFA) module that integrates the modality-specific features from each individual decoder into the shared decoder. By using skip connections, the hierarchical features between the encoder and decoder layers can be fully combined. Extensive experiments demonstrate that our SPNet outperforms cutting-edge approaches on six popular RGB-D SOD benchmarks and three camouflaged object detection benchmarks. The project is publicly available at: https://github.com/taozh2017/SPNet.
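The data flow the abstract describes — two modality-specific streams, a CIM that fuses them into a shared stream, and an MFA that folds the modality-specific decoder features back into the shared decoder — can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the `sigmoid` gating and the specific add/multiply structure here are illustrative assumptions standing in for the paper's actual CIM and MFA operations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_enhanced_integration(f_rgb, f_depth):
    """Hypothetical CIM sketch: each modality's feature map is enhanced
    by a gate computed from the other modality, then the two enhanced
    features are fused into a single shared feature."""
    rgb_enh = f_rgb + f_rgb * sigmoid(f_depth)      # depth-guided gating of RGB
    depth_enh = f_depth + f_depth * sigmoid(f_rgb)  # RGB-guided gating of depth
    return rgb_enh + depth_enh                      # fused shared feature

def multimodal_feature_aggregation(shared_feat, rgb_dec_feat, depth_dec_feat):
    """Hypothetical MFA sketch: modality-specific decoder features gate
    and re-enter the shared decoder stream, so complementary cues from
    both individual decoders are aggregated."""
    return (shared_feat
            + sigmoid(rgb_dec_feat) * shared_feat
            + sigmoid(depth_dec_feat) * shared_feat)
```

In this sketch all features share one shape (batch, channels, height, width); in the real network the CIM output would also be propagated layer-by-layer to integrate cross-level information, which is omitted here for brevity.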

Results

Task                    Dataset                    Metric  Value  Model
Semantic Segmentation   RGB-T-Glass-Segmentation   MAE     0.041  SPNet
Object Detection        DSEC                       mAP     27.7   SPNet
Object Detection        PKU-DDD17-Car              mAP50   84.7   SPNet

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
RadiomicsRetrieval: A Customizable Framework for Medical Image Retrieval Using Radiomics Features (2025-07-11)
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge (2025-07-08)