Wei Luo, Yunkang Cao, Haiming Yao, Xiaotian Zhang, Jianan Lou, Yuqi Cheng, Weiming Shen, Wenyong Yu
Anomaly detection (AD) is essential for industrial inspection, yet existing methods typically rely on "comparing" test images to normal references from a training set. However, variations in appearance and positioning often complicate the alignment of these references with the test image, limiting detection accuracy. We observe that most anomalies manifest as local variations, meaning that even within anomalous images, valuable normal information remains. We argue that this information is useful and may be better aligned with the anomalies, since both the anomalies and the normal information originate from the same image. Therefore, rather than relying on external normality from the training set, we propose INP-Former, a novel method that extracts Intrinsic Normal Prototypes (INPs) directly from the test image. Specifically, we introduce the INP Extractor, which linearly combines normal tokens to represent INPs. We further propose an INP Coherence Loss to ensure that INPs faithfully represent normality for the test image. These INPs then guide the INP-Guided Decoder to reconstruct only normal tokens, with reconstruction errors serving as anomaly scores. Additionally, we propose a Soft Mining Loss to prioritize hard-to-optimize samples during training. INP-Former achieves state-of-the-art performance in single-class, multi-class, and few-shot AD tasks across MVTec-AD, VisA, and Real-IAD, positioning it as a versatile and universal solution for AD. Remarkably, INP-Former also demonstrates some zero-shot AD capability. Code is available at: https://github.com/luow23/INP-Former.
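To make the pipeline in the abstract concrete, the following is a minimal NumPy sketch of its data flow, not the authors' implementation. It assumes that the INP Extractor can be approximated by a single cross-attention step with hypothetical learnable queries (so each prototype is a convex, hence linear, combination of the image's own tokens), that a second attention step stands in for the INP-Guided Decoder, and that the reconstruction error is a per-token cosine distance. The function names, shapes, and random inputs are all illustrative assumptions; the abstract does not specify any of them.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def extract_inps(tokens, queries):
    """Assumed stand-in for the INP Extractor.

    tokens:  (N, D) patch tokens from a single test image.
    queries: (M, D) hypothetical learnable prototype queries.
    Returns the (M, D) prototypes and the (M, N) mixing weights.
    """
    d = tokens.shape[1]
    # Each row of `weights` sums to 1, so every prototype is a
    # linear combination of tokens from the same image.
    weights = softmax(queries @ tokens.T / np.sqrt(d), axis=-1)
    return weights @ tokens, weights

def reconstruct_from_inps(tokens, inps):
    """Assumed stand-in for the INP-Guided Decoder.

    Every token is rebuilt only from the prototypes, so content that
    the prototypes do not capture cannot be reproduced well.
    """
    d = tokens.shape[1]
    weights = softmax(tokens @ inps.T / np.sqrt(d), axis=-1)  # (N, M)
    return weights @ inps

def anomaly_scores(tokens, recon, eps=1e-8):
    # Per-token reconstruction error as cosine distance (an assumption).
    num = (tokens * recon).sum(axis=1)
    den = np.linalg.norm(tokens, axis=1) * np.linalg.norm(recon, axis=1) + eps
    return 1.0 - num / den

# Illustrative random data: 196 tokens of width 64, 6 prototypes.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(196, 64))
queries = rng.normal(size=(6, 64))

inps, mix = extract_inps(tokens, queries)
recon = reconstruct_from_inps(tokens, inps)
scores = anomaly_scores(tokens, recon)  # one score per token
```

In this sketch the per-token scores would be reshaped to the patch grid and upsampled to obtain a pixel-level anomaly map; how the actual method aggregates them into an image-level score is not stated in the abstract.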
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Anomaly Detection | MVTec AD | Detection AUROC | 99.8 | INP-Former ViT-L (model-unified multi-class) |
| Anomaly Detection | MVTec AD | Segmentation AP | 72.1 | INP-Former ViT-L (model-unified multi-class) |
| Anomaly Detection | MVTec AD | Segmentation AUPRO | 95.6 | INP-Former ViT-L (model-unified multi-class) |
| Anomaly Detection | MVTec AD | Segmentation AUROC | 98.6 | INP-Former ViT-L (model-unified multi-class) |
| Anomaly Detection | VisA | Detection AUROC | 98.9 | INP-Former ViT-B (model-unified multi-class) |
| Anomaly Detection | VisA | F1-Score | 96.6 | INP-Former ViT-B (model-unified multi-class) |
| Anomaly Detection | VisA | Segmentation AUPRO | 94.4 | INP-Former ViT-B (model-unified multi-class) |
| Anomaly Detection | VisA | Segmentation AUPRO (until 30% FPR) | 94.4 | INP-Former ViT-B (model-unified multi-class) |
| Anomaly Detection | VisA | Segmentation AUROC | 98.9 | INP-Former ViT-B (model-unified multi-class) |
| Anomaly Detection | MVTec AD | Detection AUROC | 99.8 | INP-Former-Large |
| Anomaly Detection | MVTec AD | Segmentation AUROC | 98.6 | INP-Former-Large |
| Anomaly Detection | MVTec AD | Detection AUROC | 99.7 | INP-Former-Base |
| Anomaly Detection | MVTec AD | Segmentation AUROC | 98.5 | INP-Former-Base |