TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where M...

A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others

Zhiheng Li, Ivan Evtimov, Albert Gordo, Caner Hazirbas, Tal Hassner, Cristian Canton Ferrer, Chenliang Xu, Mark Ibrahim

2022-12-09CVPR 2023 1Image ClassificationDomain GeneralizationOut-of-Distribution Generalization
PaperPDFCode(official)

Abstract

Machine learning models have been found to learn shortcuts -- unintended decision rules that are unable to generalize -- undermining models' reliability. Previous works address this problem under the tenuous assumption that only a single shortcut exists in the training data. Real-world images are rife with multiple visual cues from background to texture. Key to advancing the reliability of vision systems is understanding whether existing methods can overcome multiple shortcuts or struggle in a Whac-A-Mole game, i.e., where mitigating one shortcut amplifies reliance on others. To address this shortcoming, we propose two benchmarks: 1) UrbanCars, a dataset with precisely controlled spurious cues, and 2) ImageNet-W, an evaluation set based on ImageNet for watermark, a shortcut we discovered affects nearly every modern vision model. Along with texture and background, ImageNet-W allows us to study multiple shortcuts emerging from training on natural images. We find computer vision models, including large foundation models -- regardless of training set, architecture, and supervision -- struggle when multiple shortcuts are present. Even methods explicitly designed to combat shortcuts struggle in a Whac-A-Mole dilemma. To tackle this challenge, we propose Last Layer Ensemble, a simple-yet-effective method to mitigate multiple shortcuts without Whac-A-Mole behavior. Our results surface multi-shortcut mitigation as an overlooked challenge critical to advancing the reliability of vision systems. The datasets and code are released: https://github.com/facebookresearch/Whac-A-Mole.

Results

TaskDatasetMetricValueModel
Domain AdaptationImageNet-RTop-1 Error Rate31.3LLE (ViT-B/16, SWAG, Edge Aug)
Domain AdaptationImageNet-RTop-1 Error Rate33.1LLE (ViT-H/14, MAE, Edge Aug)
Domain AdaptationImageNet-SketchTop-1 accuracy53.39LLE (ViT-H/14, MAE, Edge Aug)
Image ClassificationObjectNetTop-1 Accuracy60.78LLE (ViT-H/14, MAE, Edge Aug)
Domain GeneralizationImageNet-RTop-1 Error Rate31.3LLE (ViT-B/16, SWAG, Edge Aug)
Domain GeneralizationImageNet-RTop-1 Error Rate33.1LLE (ViT-H/14, MAE, Edge Aug)
Domain GeneralizationImageNet-SketchTop-1 accuracy53.39LLE (ViT-H/14, MAE, Edge Aug)

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations2025-07-18Adversarial attacks to image classification systems using evolutionary algorithms2025-07-17Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy2025-07-17Federated Learning for Commercial Image Sources2025-07-17MUPAX: Multidimensional Problem Agnostic eXplainable AI2025-07-17Simulate, Refocus and Ensemble: An Attention-Refocusing Scheme for Domain Generalization2025-07-17GLAD: Generalizable Tuning for Vision-Language Models2025-07-17MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling2025-07-17