OTTO Recommender Systems Dataset

Tabular | CC BY 4.0 | Introduced 2022-11-01

The OTTO session dataset is a large-scale dataset intended for multi-objective recommendation research. We collected the data from anonymized behavior logs of the OTTO webshop and app. The dataset is meant to serve as a benchmark for session-based recommendations and to foster research on multi-objective and session-based recommender systems. We also launched a Kaggle competition with the goal of predicting clicks, cart additions, and orders based on previous events in a user session.

For additional background, please see the OTTO Recommender Systems Dataset repository on GitHub.

Key Features

  • 12M real-world anonymized user sessions
  • 220M events, consisting of clicks, carts and orders
  • 1.8M unique articles in the catalogue
  • Ready-to-use data in .jsonl format
  • Evaluation metrics for multi-objective optimization
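
Because the data ships as .jsonl, it can be processed one line at a time without loading a whole file into memory. The following is a minimal Python sketch, assuming each line holds one session as a JSON object with a `session` id and an `events` list whose entries carry `aid` (article id), `ts` (timestamp) and `type` fields. These field names are an assumption here and should be checked against the GitHub repository; the sample lines and the `count_event_types` helper are made up for illustration.

```python
import io
import json
from collections import Counter

# Two made-up sessions in the ASSUMED layout (field names are not confirmed
# by this page; verify them against the dataset repository).
SAMPLE_JSONL = """\
{"session": 1, "events": [{"aid": 10, "ts": 1000, "type": "clicks"}, {"aid": 10, "ts": 2000, "type": "carts"}]}
{"session": 2, "events": [{"aid": 11, "ts": 3000, "type": "clicks"}, {"aid": 12, "ts": 4000, "type": "clicks"}, {"aid": 12, "ts": 5000, "type": "orders"}]}
"""


def count_event_types(lines):
    """Count events per type, parsing one JSON object (one session) per line."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        session = json.loads(line)
        counts.update(event["type"] for event in session["events"])
    return counts


# Any iterable of lines works, including an open file handle.
counts = count_event_types(io.StringIO(SAMPLE_JSONL))
print(dict(counts))
```

To run this on a real file, pass an open file handle instead of the `io.StringIO` wrapper, for example `open(path, encoding="utf-8")` with `path` pointing at a downloaded .jsonl file.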

Dataset Statistics

| Dataset | #sessions | #items | #events | #clicks | #carts | #orders | Density [%] |
| :------ | ---------: | --------: | ----------: | ----------: | ---------: | --------: | ----------: |
| Train | 12.899.779 | 1.855.603 | 216.716.096 | 194.720.954 | 16.896.191 | 5.098.951 | 0.0005 |
| Test | 1.671.803 | 1.019.357 | 13.851.293 | 12.340.303 | 1.155.698 | 355.292 | 0.0005 |
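
The table figures can be used for a few quick derived quantities. The sketch below copies the Train and Test rows (reading the dots as thousands separators), checks that clicks, carts and orders add up to the event total, and derives the average number of events per session and the share of each event type. The `STATS` dictionary and the `summarize` helper are illustrative names, not part of the dataset tooling.

```python
# Figures copied from the statistics table (dots there are thousands separators).
STATS = {
    "Train": {"sessions": 12_899_779, "events": 216_716_096,
              "clicks": 194_720_954, "carts": 16_896_191, "orders": 5_098_951},
    "Test": {"sessions": 1_671_803, "events": 13_851_293,
             "clicks": 12_340_303, "carts": 1_155_698, "orders": 355_292},
}

EVENT_TYPES = ("clicks", "carts", "orders")


def summarize(row):
    """Derive simple ratios from one row of the statistics table."""
    total = sum(row[t] for t in EVENT_TYPES)
    return {
        # True when the three event types add up to the reported event total.
        "consistent": total == row["events"],
        "events_per_session": row["events"] / row["sessions"],
        "shares": {t: row[t] / row["events"] for t in EVENT_TYPES},
    }


for name, row in STATS.items():
    summary = summarize(row)
    shares = ", ".join(f"{t}={summary['shares'][t]:.1%}" for t in EVENT_TYPES)
    print(f"{name}: consistent={summary['consistent']}, "
          f"events/session={summary['events_per_session']:.1f}, {shares}")
```

In both rows the three event counts sum exactly to the reported number of events, and clicks make up the large majority of events, with orders the rarest type.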