Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Theoretically Principled Trade-off between Robustness and Accuracy

Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, Michael I. Jordan

Published 2019-01-24 · Tasks: Adversarial Robustness, Adversarial Defense, Adversarial Attack, General Classification
Paper · PDF · Code (official; additional community implementations available)

Abstract

We identify a trade-off between robustness and accuracy that serves as a guiding principle in the design of defenses against adversarial examples. Although this problem has been widely studied empirically, much remains unknown concerning the theory underlying this trade-off. In this work, we decompose the prediction error for adversarial examples (robust error) as the sum of the natural (classification) error and boundary error, and provide a differentiable upper bound using the theory of classification-calibrated loss, which is shown to be the tightest possible upper bound uniform over all probability distributions and measurable predictors. Inspired by our theoretical analysis, we also design a new defense method, TRADES, to trade adversarial robustness off against accuracy. Our proposed algorithm performs well experimentally on real-world datasets. The methodology is the foundation of our entry to the NeurIPS 2018 Adversarial Vision Challenge, in which we won 1st place out of ~2,000 submissions, surpassing the runner-up approach by $11.41\%$ in terms of mean $\ell_2$ perturbation distance.
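
To make the abstract's decomposition concrete: the robust error is split as $\mathcal{R}_{\mathrm{rob}}(f) = \mathcal{R}_{\mathrm{nat}}(f) + \mathcal{R}_{\mathrm{bdy}}(f)$, and the TRADES training objective pairs a natural classification loss with a boundary term, the maximal KL divergence between predictions on a clean input and predictions within a small perturbation ball, weighted by a trade-off parameter. The sketch below is a minimal, illustrative PyTorch version of such an objective, not the official implementation; the hyperparameter values (epsilon, step_size, num_steps, beta) and the assumption that inputs lie in [0, 1] are chosen for illustration only.

import torch
import torch.nn.functional as F

def trades_style_loss(model, x_natural, y, epsilon=8/255, step_size=2/255,
                      num_steps=10, beta=6.0):
    # Inner maximization: search for x_adv within the L_inf ball around
    # x_natural that maximizes the KL divergence between the model's
    # predictions on x_adv and its (fixed) predictions on x_natural.
    model.eval()
    p_natural = F.softmax(model(x_natural), dim=1).detach()
    x_adv = x_natural.detach() + 0.001 * torch.randn_like(x_natural)
    for _ in range(num_steps):
        x_adv.requires_grad_()
        loss_kl = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                           p_natural, reduction='batchmean')
        grad = torch.autograd.grad(loss_kl, [x_adv])[0]
        x_adv = x_adv.detach() + step_size * grad.sign()
        # Project back into the epsilon-ball and the assumed [0, 1] input range.
        x_adv = torch.min(torch.max(x_adv, x_natural - epsilon),
                          x_natural + epsilon).clamp(0.0, 1.0)
    model.train()

    # Outer minimization: natural cross-entropy (classification error)
    # plus beta times the KL boundary term (robustness regularizer).
    logits_natural = model(x_natural)
    loss_natural = F.cross_entropy(logits_natural, y)
    loss_boundary = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                             F.softmax(logits_natural, dim=1),
                             reduction='batchmean')
    return loss_natural + beta * loss_boundary

In this sketch, larger values of beta push the optimizer toward robustness at some cost in natural accuracy, which is the trade-off the paper analyzes.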

Results

Task               | Dataset  | Metric        | Value | Model
Adversarial Attack | CIFAR-10 | Attack: PGD20 | 45.9  | TRADES [zhang2019b]

Related Papers

Bridging Robustness and Generalization Against Word Substitution Attacks in NLP via the Growth Bound Matrix Approach (2025-07-14)
3DGAA: Realistic and Robust 3D Gaussian-based Adversarial Attack for Autonomous Driving (2025-07-14)
VIP: Visual Information Protection through Adversarial Attacks on Vision-Language Models (2025-07-11)
Identifying the Smallest Adversarial Load Perturbations that Render DC-OPF Infeasible (2025-07-10)
ScoreAdv: Score-based Targeted Generation of Natural Adversarial Examples via Diffusion Models (2025-07-08)
Tail-aware Adversarial Attacks: A Distributional Approach to Efficient LLM Jailbreaking (2025-07-06)
Evaluating the Evaluators: Trust in Adversarial Robustness Tests (2025-07-04)
Rectifying Adversarial Sample with Low Entropy Prior for Test-Time Defense (2025-07-04)