Zhen Zhao, Yuhong Guo, Haifeng Shen, Jieping Ye
In this paper, we propose a novel end-to-end unsupervised deep domain adaptation model for adaptive object detection that exploits multi-label object recognition as a dual auxiliary task. The model uses multi-label prediction to reveal the object categories present in each image, and then uses the prediction results to perform conditional adversarial global feature alignment. This allows the model to handle the multi-modal structure of image features and bridge the domain divergence at the global feature level while preserving the discriminability of the features. Moreover, we introduce a prediction consistency regularization mechanism to assist object detection, which uses the multi-label prediction results as auxiliary regularization information to ensure consistent object category discoveries between the object recognition task and the object detection task. Experiments on a few benchmark datasets show that the proposed model outperforms the state-of-the-art comparison methods.
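To make the two mechanisms in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the outer-product conditioning, the max-over-proposals aggregation, and the mean absolute difference penalty are all assumptions chosen for illustration, and every function name is hypothetical.

```python
from typing import List


def conditioned_feature(feature: List[float], label_probs: List[float]) -> List[float]:
    """Condition a global feature on multi-label predictions.

    Assumption: conditioning is done with a flattened outer product, so a
    domain discriminator fed this vector sees one scaled copy of the
    feature per object category.
    """
    return [f * p for p in label_probs for f in feature]


def image_level_scores(detection_probs: List[List[float]]) -> List[float]:
    """Aggregate per-proposal class probabilities into image-level scores.

    Assumption: a category counts as detected as strongly as its most
    confident proposal (max over proposals).
    """
    num_classes = len(detection_probs[0])
    return [max(row[c] for row in detection_probs) for c in range(num_classes)]


def consistency_penalty(label_probs: List[float],
                        detection_probs: List[List[float]]) -> float:
    """Penalize disagreement between recognition and detection.

    Assumption: the penalty is the mean absolute difference between the
    multi-label probabilities and the aggregated detection scores.
    """
    aggregated = image_level_scores(detection_probs)
    total = sum(abs(p - q) for p, q in zip(label_probs, aggregated))
    return total / len(label_probs)


# Toy example: 3 categories, 2 region proposals, a 2-dimensional global feature.
label_probs = [0.9, 0.1, 0.8]
detection_probs = [[0.7, 0.2, 0.1],
                   [0.1, 0.1, 0.6]]
disc_input = conditioned_feature([1.0, 2.0], label_probs)
penalty = consistency_penalty(label_probs, detection_probs)
```

In this toy setup the discriminator input has one feature copy per category (6 values), and the penalty shrinks toward zero as the detector's image-level category scores agree with the multi-label predictions.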
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image-to-Image Translation | Cityscapes to Foggy Cityscapes | mAP | 38.8 | MCAR |
| Domain Adaptation | Cityscapes to Foggy Cityscapes | mAP@0.5 | 38.8 | MCAR |
| Image Generation | Cityscapes to Foggy Cityscapes | mAP | 38.8 | MCAR |
| Object Detection | Watercolor2k | mAP | 56 | MCAR |
| Unsupervised Domain Adaptation | Cityscapes to Foggy Cityscapes | mAP@0.5 | 38.8 | MCAR |
| 2D Classification | Watercolor2k | mAP | 56 | MCAR |
| 2D Object Detection | Watercolor2k | mAP | 56 | MCAR |