Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Unified Image and Video Saliency Modeling

Richard Droste, Jianbo Jiao, J. Alison Noble

2020-03-11 | ECCV 2020 8 | Saliency Prediction | Video Saliency Detection | Domain Adaptation

Abstract

Visual saliency modeling for images and videos is treated as two independent tasks in recent computer vision literature. While image saliency modeling is a well-studied problem and progress on benchmarks like SALICON and MIT300 is slowing, video saliency models have shown rapid gains on the recent DHF1K benchmark. Here, we take a step back and ask: Can image and video saliency modeling be approached via a unified model, with mutual benefit? We identify different sources of domain shift between image and video saliency data and between different video saliency datasets as a key challenge for effective joint modeling. To address this, we propose four novel domain adaptation techniques (Domain-Adaptive Priors, Domain-Adaptive Fusion, Domain-Adaptive Smoothing, and Bypass-RNN) in addition to an improved formulation of learned Gaussian priors. We integrate these techniques into a simple and lightweight encoder-RNN-decoder-style network, UNISAL, and train it jointly with image and video saliency data. We evaluate our method on the video saliency datasets DHF1K, Hollywood-2, and UCF-Sports, and the image saliency datasets SALICON and MIT300. With one set of parameters, UNISAL achieves state-of-the-art performance on all video saliency datasets and is on par with the state of the art for image saliency datasets, despite faster runtime and a 5 to 20-fold smaller model size compared to all competing deep methods. We provide retrospective analyses and ablation studies that confirm the importance of the domain shift modeling. The code is available at https://github.com/rdroste/unisal
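The abstract names Gaussian priors and domain-specific variants of them but does not give their formulation. As a rough illustration only, the sketch below builds a 2D Gaussian prior map whose parameters are looked up per domain. The function name `gaussian_prior_map`, the table `PRIOR_PARAMS`, the domain names, and all parameter values are hypothetical; in a trained model such parameters would be learned, and this is not claimed to match UNISAL's implementation.

```python
import math

# Hypothetical per-domain parameters: (mu_x, mu_y, sigma_x, sigma_y),
# expressed in normalized [0, 1] image coordinates. In a real model these
# would be learned; the values here are placeholders for illustration.
PRIOR_PARAMS = {
    "domain_a": (0.5, 0.5, 0.20, 0.15),
    "domain_b": (0.5, 0.5, 0.30, 0.25),
}


def gaussian_prior_map(domain, height, width):
    """Return a height x width axis-aligned Gaussian map (peak value 1.0)."""
    mu_x, mu_y, sigma_x, sigma_y = PRIOR_PARAMS[domain]
    rows = []
    for i in range(height):
        y = (i + 0.5) / height  # pixel-centre coordinate in [0, 1]
        row = []
        for j in range(width):
            x = (j + 0.5) / width
            z = ((x - mu_x) / sigma_x) ** 2 + ((y - mu_y) / sigma_y) ** 2
            row.append(math.exp(-0.5 * z))
        rows.append(row)
    return rows


if __name__ == "__main__":
    prior_a = gaussian_prior_map("domain_a", 5, 5)
    prior_b = gaussian_prior_map("domain_b", 5, 5)
    # The wider prior of domain_b assigns more weight to the image corner.
    print(prior_a[0][0] < prior_b[0][0])
```

Selecting the parameter set by domain keeps the rest of the network shared while letting each dataset have its own spatial bias, which is the general idea a per-domain prior is meant to capture.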

Results

Task | Dataset | Metric | Value | Model
Saliency Detection | MSU Video Saliency Prediction | AUC-J | 0.858 | UNISAL (videos)
Saliency Detection | MSU Video Saliency Prediction | CC | 0.707 | UNISAL (videos)
Saliency Detection | MSU Video Saliency Prediction | FPS | 70.46 | UNISAL (videos)
Saliency Detection | MSU Video Saliency Prediction | KLDiv | 0.536 | UNISAL (videos)
Saliency Detection | MSU Video Saliency Prediction | NSS | 2.03 | UNISAL (videos)
Saliency Detection | MSU Video Saliency Prediction | SIM | 0.609 | UNISAL (videos)
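For readers unfamiliar with the metric names in the table, the following is a minimal sketch of four of them (CC, NSS, SIM, KLDiv) using their commonly cited definitions from the saliency literature. It assumes saliency maps are given as flat lists of non-negative values and fixations as a flat list of 0/1 flags. The function names and the `eps` constant are assumptions for illustration, and the benchmark's own evaluation code may differ in details such as resizing and normalization. AUC-J is omitted for brevity.

```python
import math


def _normalize_sum(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def _standardize(values):
    """Shift and scale values to zero mean and unit (population) std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def cc(pred, gt):
    """Pearson correlation coefficient between two maps (higher is better)."""
    p, g = _standardize(pred), _standardize(gt)
    return sum(a * b for a, b in zip(p, g)) / len(p)


def nss(pred, fixations):
    """Mean standardized prediction at fixated pixels (higher is better)."""
    p = _standardize(pred)
    hits = [v for v, f in zip(p, fixations) if f]
    return sum(hits) / len(hits)


def sim(pred, gt):
    """Histogram intersection of the two normalized maps (higher is better)."""
    p, g = _normalize_sum(pred), _normalize_sum(gt)
    return sum(min(a, b) for a, b in zip(p, g))


def kldiv(pred, gt, eps=1e-12):
    """KL divergence of the prediction from the ground truth (lower is better)."""
    p, g = _normalize_sum(pred), _normalize_sum(gt)
    return sum(b * math.log(eps + b / (a + eps)) for a, b in zip(p, g))


if __name__ == "__main__":
    gt = [0.1, 0.2, 0.3, 0.4]
    pred = [0.2, 0.2, 0.2, 0.4]
    print(cc(pred, gt), sim(pred, gt), kldiv(pred, gt))
```

Note that KLDiv is the only one of these four where a lower value indicates a better prediction, which matters when comparing rows of the table.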

Related Papers

A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
Domain Borders Are There to Be Crossed With Federated Few-Shot Adaptation (2025-07-14)
An Offline Mobile Conversational Agent for Mental Health Support: Learning from Emotional Dialogues and Psychological Texts with Student-Centered Evaluation (2025-07-11)
The Bayesian Approach to Continual Learning: An Overview (2025-07-11)
Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection (2025-07-10)
YOLO-APD: Enhancing YOLOv8 for Robust Pedestrian Detection on Complex Road Geometries (2025-07-07)
CORE-ReID V2: Advancing the Domain Adaptation for Object Re-Identification with Optimized Training and Ensemble Fusion (2025-07-04)
Underwater Monocular Metric Depth Estimation: Real-World Benchmarks and Synthetic Fine-Tuning (2025-07-02)