Junho Kim, Jaehyeok Bae, Gangin Park, Dongsu Zhang, Young Min Kim
We introduce N-ImageNet, a large-scale dataset targeted at robust, fine-grained object recognition with event cameras. The dataset is collected using programmable hardware in which an event camera consistently moves around a monitor displaying images from ImageNet. N-ImageNet serves as a challenging benchmark for event-based object recognition due to its large number of classes and samples. We empirically show that pretraining on N-ImageNet improves the performance of event-based classifiers and helps them learn from little labeled data. In addition, we present several variants of N-ImageNet to test the robustness of event-based classifiers under diverse camera trajectories and severe lighting conditions, and propose a novel event representation to alleviate the resulting performance degradation. To the best of our knowledge, we are the first to quantitatively investigate the effects of various environmental conditions on event-based object recognition algorithms. N-ImageNet and its variants are expected to guide practical implementations for deploying event-based object recognition algorithms in the real world.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Classification | N-ImageNet (mini) | Accuracy (%) | 61.42 | Event Image |
| Classification | N-ImageNet (mini) | Accuracy (%) | 61.02 | Event Histogram |
| Classification | N-ImageNet (mini) | Accuracy (%) | 60.46 | Timestamp Image |
| Classification | N-ImageNet (mini) | Accuracy (%) | 59.74 | DiST |
| Classification | N-ImageNet (mini) | Accuracy (%) | 58.38 | Sorted Time Surface |
| Classification | N-ImageNet (mini) | Accuracy (%) | 53.52 | Binary Event Image |
| Classification | N-ImageNet | Accuracy (%) | 48.93 | Event Spike Tensor |
| Classification | N-ImageNet | Accuracy (%) | 48.43 | DiST |
| Classification | N-ImageNet | Accuracy (%) | 47.9 | Sorted Time Surface |
| Classification | N-ImageNet | Accuracy (%) | 47.73 | Event Histogram |
| Classification | N-ImageNet | Accuracy (%) | 47.14 | HATS |
| Classification | N-ImageNet | Accuracy (%) | 46.36 | Binary Event Image |
| Classification | N-ImageNet | Accuracy (%) | 45.86 | Timestamp Image |
| Classification | N-ImageNet | Accuracy (%) | 45.77 | Event Image |
| Classification | N-ImageNet | Accuracy (%) | 44.32 | Time Surface |
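
Several model names in the table above (Event Histogram, Binary Event Image, Timestamp Image) refer to ways of turning an event stream into an image-like array that a standard classifier can consume. The sketch below illustrates the commonly used definitions of these three representations; it is not the authors' implementation. It assumes events arrive as NumPy arrays of pixel coordinates `x` and `y`, timestamps `t` (non-negative), and polarities `p` encoded as 0 or 1. The function names and the toy data are hypothetical.

```python
import numpy as np

def event_histogram(x, y, p, height, width):
    """Count events per pixel, with one channel per polarity (assumed 0 or 1)."""
    hist = np.zeros((2, height, width), dtype=np.int64)
    # np.add.at accumulates correctly when the same pixel index repeats.
    np.add.at(hist, (p, y, x), 1)
    return hist

def binary_event_image(x, y, height, width):
    """Mark a pixel with 1 if at least one event occurred there."""
    img = np.zeros((height, width), dtype=np.uint8)
    img[y, x] = 1
    return img

def timestamp_image(x, y, t, p, height, width):
    """Keep the latest timestamp seen at each pixel, per polarity."""
    ts = np.zeros((2, height, width), dtype=np.float64)
    # np.maximum.at keeps the largest timestamp when a pixel index repeats.
    np.maximum.at(ts, (p, y, x), t)
    return ts

# Toy event stream on a 2 x 3 sensor (hypothetical data).
x = np.array([0, 1, 1, 2])
y = np.array([0, 0, 0, 1])
p = np.array([1, 0, 0, 1])
t = np.array([0.1, 0.2, 0.3, 0.4])

hist = event_histogram(x, y, p, height=2, width=3)
binary = binary_event_image(x, y, height=2, width=3)
ts = timestamp_image(x, y, t, p, height=2, width=3)
```

In this toy stream, pixel (row 0, column 1) receives two negative events, so the histogram keeps a count of 2 there while the binary image only records that something happened. That difference in retained information is one reason the representations can score differently on the same classifier.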