Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


SUM: Saliency Unification through Mamba for Visual Attention Modeling

Alireza Hosseini, Amirhossein Kazerouni, Saeed Akhavan, Michael Brudno, Babak Taati

Date: 2024-06-25
Tasks: Marketing, Saliency Prediction, Saliency Detection

Abstract

Visual attention modeling, important for interpreting and prioritizing visual stimuli, plays a significant role in applications such as marketing, multimedia, and robotics. Traditional saliency prediction models, especially those based on Convolutional Neural Networks (CNNs) or Transformers, achieve notable success by leveraging large-scale annotated datasets. However, the current state-of-the-art (SOTA) models that use Transformers are computationally expensive. Additionally, each image type often requires a separate model, so a unified approach is lacking. In this paper, we propose Saliency Unification through Mamba (SUM), a novel approach that integrates the efficient long-range dependency modeling of Mamba with U-Net to provide a unified model for diverse image types. Using a novel Conditional Visual State Space (C-VSS) block, SUM dynamically adapts to various image types, including natural scenes, web pages, and commercial imagery, ensuring universal applicability across different data types. Our comprehensive evaluations across five benchmarks demonstrate that SUM seamlessly adapts to different visual characteristics and consistently outperforms existing models. These results position SUM as a versatile and powerful tool for advancing visual attention modeling, offering a robust solution universally applicable across different types of visual content.

Results

Task                                                Dataset  Metric    Value  Model
Saliency Detection                                  CAT2000  AUC       0.888  SUM
Saliency Detection                                  CAT2000  NSS       2.423  SUM
Saliency Detection                                  CAT2000  KL        0.27   SUM
Saliency Detection                                  MIT300   AUC-Judd  0.913  SUM
Saliency Detection                                  MIT300   CC        0.768  SUM
Saliency Detection                                  MIT300   KLD       0.563  SUM
Saliency Detection                                  MIT300   NSS       2.839  SUM
Saliency Detection                                  MIT300   SIM       0.63   SUM
Saliency Detection                                  SALECI   KL        0.473  SUM
Saliency Detection                                  SALICON  AUC       0.876  SUM
Saliency Detection                                  SALICON  CC        0.909  SUM
Saliency Detection                                  SALICON  KLD       0.192  SUM
Saliency Detection                                  SALICON  NSS       1.981  SUM
Saliency Detection                                  SALICON  SIM       0.804  SUM
Saliency Prediction                                 CAT2000  KL        0.27   SUM
Saliency Prediction                                 MIT300   AUC-Judd  0.913  SUM
Saliency Prediction                                 MIT300   CC        0.768  SUM
Saliency Prediction                                 MIT300   KLD       0.563  SUM
Saliency Prediction                                 MIT300   NSS       2.839  SUM
Saliency Prediction                                 MIT300   SIM       0.63   SUM
Saliency Prediction                                 SALECI   KL        0.473  SUM
Saliency Prediction                                 SALICON  AUC       0.876  SUM
Saliency Prediction                                 SALICON  CC        0.909  SUM
Saliency Prediction                                 SALICON  KLD       0.192  SUM
Saliency Prediction                                 SALICON  NSS       1.981  SUM
Saliency Prediction                                 SALICON  SIM       0.804  SUM
Few-Shot Transfer Learning for Saliency Prediction  CAT2000  KL        0.27   SUM
Few-Shot Transfer Learning for Saliency Prediction  MIT300   AUC-Judd  0.913  SUM
Few-Shot Transfer Learning for Saliency Prediction  MIT300   CC        0.768  SUM
Few-Shot Transfer Learning for Saliency Prediction  MIT300   KLD       0.563  SUM
Few-Shot Transfer Learning for Saliency Prediction  MIT300   NSS       2.839  SUM
Few-Shot Transfer Learning for Saliency Prediction  MIT300   SIM       0.63   SUM
Few-Shot Transfer Learning for Saliency Prediction  SALECI   KL        0.473  SUM
Few-Shot Transfer Learning for Saliency Prediction  SALICON  AUC       0.876  SUM
Few-Shot Transfer Learning for Saliency Prediction  SALICON  CC        0.909  SUM
Few-Shot Transfer Learning for Saliency Prediction  SALICON  KLD       0.192  SUM
Few-Shot Transfer Learning for Saliency Prediction  SALICON  NSS       1.981  SUM
Few-Shot Transfer Learning for Saliency Prediction  SALICON  SIM       0.804  SUM
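
For readers unfamiliar with the metric names in the results above, the following is a minimal NumPy sketch of the commonly used textbook definitions of CC, SIM, KLD (listed as KL for some datasets), and NSS. It is not the authors' evaluation code, and benchmark implementations may differ in normalization details. The function names, the `EPS` constant, and the assumption that saliency maps are non-negative arrays are all illustrative choices. The AUC variants are omitted for brevity.

```python
import numpy as np

EPS = 1e-12  # assumed small constant guarding against division by zero and log(0)


def _as_distribution(m):
    """Scale a non-negative map so that it sums to 1 (assumes m >= 0)."""
    m = np.asarray(m, dtype=np.float64)
    return m / (m.sum() + EPS)


def cc(pred, gt_density):
    """Pearson correlation between predicted and ground-truth maps (higher is better)."""
    p = np.asarray(pred, dtype=np.float64).ravel()
    g = np.asarray(gt_density, dtype=np.float64).ravel()
    p = (p - p.mean()) / (p.std() + EPS)
    g = (g - g.mean()) / (g.std() + EPS)
    return float(np.mean(p * g))


def sim(pred, gt_density):
    """Histogram intersection of the two maps as distributions (higher is better)."""
    p, q = _as_distribution(pred), _as_distribution(gt_density)
    return float(np.minimum(p, q).sum())


def kld(pred, gt_density):
    """KL divergence of the ground truth from the prediction (lower is better)."""
    p, q = _as_distribution(pred), _as_distribution(gt_density)
    return float(np.sum(q * np.log(EPS + q / (p + EPS))))


def nss(pred, fixations):
    """Mean z-scored prediction at fixated pixels (higher is better).

    `fixations` is a binary mask with the same shape as `pred`.
    """
    p = np.asarray(pred, dtype=np.float64)
    z = (p - p.mean()) / (p.std() + EPS)
    return float(z[np.asarray(fixations, dtype=bool)].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.random((8, 8))           # toy ground-truth density
    pred = gt + 0.1 * rng.random((8, 8))  # toy prediction close to the ground truth
    fix = gt > 0.9                    # toy fixation mask
    print(cc(pred, gt), sim(pred, gt), kld(pred, gt), nss(pred, fix))
```

Under these definitions a perfect prediction gives CC and SIM close to 1 and KLD close to 0, which is why lower KLD values and higher CC, SIM, and NSS values indicate better agreement with human fixations.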

Related Papers

COLIBRI Fuzzy Model: Color Linguistic-Based Representation and Interpretation (2025-07-15)
Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis (2025-07-15)
USD: A User-Intent-Driven Sampling and Dual-Debiasing Framework for Large-Scale Homepage Recommendations (2025-07-09)
Potential Customer Lifetime Value in Financial Institutions: The Usage Of Open Banking Data to Improve CLV Estimation (2025-06-28)
Scalable Subset Selection in Linear Mixed Models (2025-06-25)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
The Shape of Consumer Behavior: A Symbolic and Topological Analysis of Time Series (2025-06-24)
Which Company Adjustment Matter? Insights from Uplift Modeling on Financial Health (2025-06-23)