Image Classification on EarlyNSD

Metric: Test F1 (higher is better)
