Shyam Nandan Rai, Fabio Cermelli, Dario Fontanel, Carlo Masone, Barbara Caputo
Anomaly segmentation is a critical task for driving applications and is traditionally approached as a per-pixel classification problem. However, reasoning about each pixel individually, without considering its contextual semantics, results in high uncertainty around object boundaries and numerous false positives. We propose a paradigm change by shifting from per-pixel classification to mask classification. Our mask-based method, Mask2Anomaly, demonstrates the feasibility of integrating an anomaly detection method into a mask-classification architecture. Mask2Anomaly includes several technical novelties designed to improve the detection of anomalies in masks: i) a global masked attention module to focus individually on the foreground and background regions; ii) a mask contrastive learning objective that maximizes the margin between an anomaly and the known classes; and iii) a mask refinement solution to reduce false positives. Mask2Anomaly achieves new state-of-the-art results across a range of benchmarks, in both per-pixel and component-level evaluations. In particular, Mask2Anomaly reduces the average false positive rate by 60% with respect to the previous state of the art. GitHub page: https://github.com/shyam671/Mask2Anomaly-Unmasking-Anomalies-in-Road-Scene-Segmentation.
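To make the shift from per-pixel to mask classification concrete, the sketch below shows one plausible way to derive a per-pixel anomaly score from mask-classification outputs: each predicted mask carries a class distribution, and a pixel is scored as anomalous when no known class claims it with high confidence. This is an illustrative assumption and not necessarily the exact scoring rule used in Mask2Anomaly; the function name `mask_anomaly_scores`, the input layout, and the omission of a "no object" class are all hypothetical choices made for brevity.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function mapping a mask logit to a membership probability."""
    return 1.0 / (1.0 + math.exp(-x))


def mask_anomaly_scores(class_logits, mask_logits):
    """Hypothetical mask-based anomaly scoring (an assumption, not the paper's code).

    class_logits: N x K nested list, one row of known-class logits per mask.
    mask_logits:  N x P nested list, one row of per-pixel logits per mask
                  (the image is flattened to P pixels).
    Returns a list of P scores; higher means more likely anomalous.
    """
    n_masks = len(class_logits)
    n_classes = len(class_logits[0])
    n_pixels = len(mask_logits[0])

    class_probs = [softmax(row) for row in class_logits]
    mask_probs = [[sigmoid(v) for v in row] for row in mask_logits]

    scores = []
    for p in range(n_pixels):
        # Evidence for each known class at pixel p, accumulated over all masks.
        per_class = [
            sum(class_probs[n][c] * mask_probs[n][p] for n in range(n_masks))
            for c in range(n_classes)
        ]
        # A pixel that no known class explains well gets a score close to 1.
        scores.append(1.0 - max(per_class))
    return scores
```

Because the score is computed from whole masks rather than isolated pixels, neighbouring pixels inside one mask tend to receive similar scores, which is the intuition behind the reduced boundary uncertainty described above.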
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Anomaly Detection | Road Anomaly | AP | 79.7 | Mask2Anomaly |
| Anomaly Detection | Road Anomaly | FPR95 | 13.45 | Mask2Anomaly |
| Anomaly Detection | Fishyscapes | AP | 95.2 | Mask2Anomaly |
| Anomaly Detection | Fishyscapes | FPR95 | 0.82 | Mask2Anomaly |
| Anomaly Detection | Lost and Found | AP | 86.59 | Mask2Anomaly |
| Anomaly Detection | Lost and Found | FPR95 | 5.75 | Mask2Anomaly |
| Anomaly Detection | Fishyscapes L&F | AP | 46.04 | Mask2Anomaly |
| Anomaly Detection | Fishyscapes L&F | FPR95 | 4.36 | Mask2Anomaly |
| Semantic Segmentation | StreetHazards | Open-mIoU | 59.8 | Mask2Anomaly |
| Object Detection | OoDIS | AP | 1.24 | Mask2Anomaly |
| Object Detection | OoDIS | AP50 | 2.23 | Mask2Anomaly |
| Instance Segmentation | OoDIS | AP | 13.73 | Mask2Anomaly |
| Instance Segmentation | OoDIS | AP50 | 24.3 | Mask2Anomaly |
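For readers unfamiliar with the per-pixel metrics in the table, the following is a minimal sketch of how AP (average precision) and FPR95 (false positive rate at 95% true positive rate) can be computed from anomaly scores and binary ground-truth labels. It follows the standard step-wise definitions and is not the benchmarks' official evaluation code; the function names are hypothetical. The functions return fractions in [0, 1], whereas the table values appear to be reported as percentages.

```python
def _ranked(scores, labels):
    """Pair scores with labels and sort by descending score."""
    return sorted(zip(scores, labels), key=lambda pair: -pair[0])


def average_precision(scores, labels):
    """Step-wise AP: sum over thresholds of (recall gain) * precision."""
    positives = sum(labels)
    if positives == 0:
        raise ValueError("at least one anomalous (label 1) sample is required")
    ranked = _ranked(scores, labels)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this threshold before evaluating.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        recall = tp / positives
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def fpr_at_95_tpr(scores, labels):
    """FPR at the highest threshold whose TPR reaches at least 95%."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both anomalous and normal samples are required")
    ranked = _ranked(scores, labels)
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        if tp / positives >= 0.95:
            return fp / negatives
    return 1.0


# Toy example: label 1 marks an anomalous pixel, higher score = more anomalous.
scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0]
ap = average_precision(scores, labels)
fpr95 = fpr_at_95_tpr(scores, labels)
```

Higher AP and lower FPR95 are better: a low FPR95 means few normal pixels are flagged as anomalous at the operating point where nearly all true anomalies are detected.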