Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

AnoShift: A Distribution Shift Benchmark for Unsupervised Anomaly Detection

Marius Dragoi, Elena Burceanu, Emanuela Haller, Andrei Manolache, Florin Brad

2022-06-30. Tasks: Unsupervised Anomaly Detection, Intrusion Detection, Network Intrusion Detection

Abstract

Analyzing distribution shift in data is a growing research direction in modern Machine Learning (ML), and it has led to new benchmarks designed for studying the generalization properties of ML models. Existing benchmarks focus on supervised learning and, to the best of our knowledge, none targets unsupervised learning. We therefore introduce an unsupervised anomaly detection benchmark with data that shifts over time, built on Kyoto-2006+, a traffic dataset for network intrusion detection. This data meets the premise of a shifting input distribution: it covers a large time span ($10$ years) with naturally occurring changes over time (e.g., users modifying their behavior patterns, and software updates). We first highlight the non-stationary nature of the data using a basic per-feature analysis, t-SNE, and an Optimal Transport approach for measuring the overall distribution distances between years. Next, we propose AnoShift, a protocol that splits the data into IID, NEAR, and FAR testing splits. We validate the performance degradation over time with diverse models, ranging from classical approaches to deep learning. Finally, we show that acknowledging the distribution shift problem and properly addressing it improves performance compared to classical training, which assumes independent and identically distributed data (on average, by up to $3\%$ for our approach). Dataset and code are available at https://github.com/bit-ml/AnoShift/.
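The idea of evaluating one unsupervised detector on IID, NEAR, and FAR test splits can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the year ranges in `SPLIT_YEARS`, the synthetic one-feature data from `make_year`, and the toy detector (distance from the training mean). It is not the authors' pipeline; their code lives in the linked repository.

```python
import random
import statistics

# Assumed year ranges for each test split; illustrative only.
SPLIT_YEARS = {
    "IID": range(2006, 2011),
    "NEAR": range(2011, 2014),
    "FAR": range(2014, 2016),
}
TRAIN_YEARS = SPLIT_YEARS["IID"]


def roc_auc(labels, scores):
    """ROC-AUC as the chance that a random anomaly outscores a random inlier."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_year(year, rng, n=200):
    """Toy data: the inlier mean drifts with the year, anomalies stay near 4.0."""
    drift = 0.6 * (year - 2006)
    rows = []
    for _ in range(n):
        is_anomaly = rng.random() < 0.2
        value = rng.gauss(4.0 if is_anomaly else drift, 1.0)
        rows.append({"year": year, "x": value, "label": int(is_anomaly)})
    return rows


def evaluate(seed=0):
    rng = random.Random(seed)
    rows = [r for year in range(2006, 2016) for r in make_year(year, rng)]

    # Hold out half of the training-period rows as the IID test set.
    train_period = [r for r in rows if r["year"] in TRAIN_YEARS]
    rng.shuffle(train_period)
    half = len(train_period) // 2
    train, iid_test = train_period[:half], train_period[half:]

    # Fit without labels: the anomaly score is the distance from the training mean.
    xs = [r["x"] for r in train]
    mean, std = statistics.fmean(xs), statistics.pstdev(xs)

    results = {}
    for name, years in SPLIT_YEARS.items():
        test = iid_test if name == "IID" else [r for r in rows if r["year"] in years]
        scores = [abs(r["x"] - mean) / std for r in test]
        results[name] = roc_auc([r["label"] for r in test], scores)
    return results


results = evaluate()
for name, auc in results.items():
    print(f"ROC-AUC {name}: {100 * auc:.2f}")
```

Because the toy inliers drift toward the region where anomalies sit, the detector's scores become less informative on later years, which is the kind of degradation the benchmark is meant to expose.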

Results

Dataset: AnoShift. Metric: ROC-AUC (%). The same results are listed under both the Anomaly Detection and Unsupervised Anomaly Detection tasks.

Model                           FAR     IID     NEAR    ID (In-Distribution setup)
COPOD                           50.42   85.62   54.24   80.89
OC-SVM                          49.57   76.86   71.43   68.73
SO-GAAL                         49.35   50.48   54.55   49.9
ECOD Li et al. (2022)           49.19   84.76   44.87   79.41
LOF                             34.96   91.5    79.29   87.61
deepSVDD                        34.53   92.67   87      88.24
LUNAR                           28.19   85.75   49.03   78.53
BERT                            28.15   84.54   86.05   79.62
IsoForest                       27.16   86.09   75.26   81.27
Internal Contrastive Learning   22.45   84.86   52.26   66.99
AE for anomalies                19.96   81      44.06   64.08
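
One way to read the results above is as the drop in ROC-AUC from the IID split to the FAR split for each model. The short sketch below does that arithmetic on the listed IID and FAR values; the variable names and the choice of "IID minus FAR" as a summary are assumptions made for illustration, not a metric defined on this page.

```python
# (IID, FAR) ROC-AUC values copied from the results listed above.
ROC_AUC = {
    "COPOD": (85.62, 50.42),
    "OC-SVM": (76.86, 49.57),
    "SO-GAAL": (50.48, 49.35),
    "ECOD": (84.76, 49.19),
    "LOF": (91.5, 34.96),
    "deepSVDD": (92.67, 34.53),
    "LUNAR": (85.75, 28.19),
    "BERT": (84.54, 28.15),
    "IsoForest": (86.09, 27.16),
    "Internal Contrastive Learning": (84.86, 22.45),
    "AE for anomalies": (81.0, 19.96),
}

# Drop in ROC-AUC points when moving from the IID split to the FAR split.
drops = {model: iid - far for model, (iid, far) in ROC_AUC.items()}

# Print models from the largest to the smallest drop.
for model, drop in sorted(drops.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model:32s} {drop:6.2f}")
```

A small drop is not necessarily a sign of robustness: SO-GAAL barely changes between splits, but its IID score is already close to 50.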

Related Papers

CyberRAG: An agentic RAG cyber attack classification and reporting tool (2025-07-03)
Detection of Cyber Attack in Network using Machine Learning Techniques. (2025-07-02)
Generative Adversarial Evasion and Out-of-Distribution Detection for UAV Cyber-Attacks (2025-06-26)
Poster: Enhancing GNN Robustness for Network Intrusion Detection via Agent-based Analysis (2025-06-25)
KnowML: Improving Generalization of ML-NIDS with Attack Knowledge Graphs (2025-06-24)
Robust Anomaly Detection in Network Traffic: Evaluating Machine Learning Models on CICIDS2017 (2025-06-23)
Dynamic Temporal Positional Encodings for Early Intrusion Detection in IoT (2025-06-22)
On the Performance of Cyber-Biomedical Features for Intrusion Detection in Healthcare 5.0 (2025-06-19)