NMS Strikes Back

Jeffrey Ouyang-Zhang, Jang Hyun Cho, Xingyi Zhou, Philipp Krähenbühl

2022-12-12 | Object Detection

Abstract

Detection Transformer (DETR) directly transforms queries to unique objects by using one-to-one bipartite matching during training and enables end-to-end object detection. Recently, these models have surpassed traditional detectors on COCO with undeniable elegance. However, they differ from traditional detectors in multiple designs, including model architecture and training schedules, and thus the effectiveness of one-to-one matching is not fully understood. In this work, we conduct a strict comparison between the one-to-one Hungarian matching in DETRs and the one-to-many label assignments in traditional detectors with non-maximum suppression (NMS). Surprisingly, we observe that one-to-many assignments with NMS consistently outperform standard one-to-one matching under the same setting, with a significant gain of up to 2.5 mAP. Our detector, which trains Deformable-DETR with traditional IoU-based label assignment, achieves 50.2 COCO mAP within 12 epochs (1x schedule) with a ResNet50 backbone, outperforming all existing traditional or transformer-based detectors in this setting. On multiple datasets, schedules, and architectures, we consistently show that bipartite matching is unnecessary for performant detection transformers. Furthermore, we attribute the success of detection transformers to their expressive transformer architecture. Code is available at https://github.com/jozhang97/DETA.
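The contrast between one-to-many IoU-based assignment and NMS can be made concrete with a small sketch. The code below is not taken from the DETA repository: the function names, the (x1, y1, x2, y2) box format, and the thresholds (0.6 for positives, 0.7 for NMS) are illustrative assumptions. It shows the general idea only: during training, every prediction that overlaps a ground-truth box strongly enough is treated as a positive for that box, so several predictions may share one object; at inference, greedy NMS removes the resulting near-duplicates.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_one_to_many(predictions, gt_boxes, pos_thresh=0.6):
    """Label each prediction with the index of its best-overlapping
    ground-truth box, or -1 (background) if the best IoU is below
    pos_thresh. Several predictions may receive the same index.
    pos_thresh is a hypothetical value chosen for illustration."""
    labels = []
    for p in predictions:
        best_iou, best_gt = 0.0, -1
        for j, g in enumerate(gt_boxes):
            overlap = iou(p, g)
            if overlap > best_iou:
                best_iou, best_gt = overlap, j
        labels.append(best_gt if best_iou >= pos_thresh else -1)
    return labels


def nms(boxes, scores, iou_thresh=0.7):
    """Greedy non-maximum suppression: visit boxes in descending score
    order and keep a box only if it does not overlap an already kept
    box by more than iou_thresh. Returns the kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


# Toy data (made up): two objects, four predictions.
gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7, 0.1]

labels = assign_one_to_many(predictions, gt_boxes)  # [0, 0, 1, -1]
kept = nms(predictions, scores)                     # [0, 2, 3]
```

In the toy data, the first two predictions both become positives for object 0, which one-to-one Hungarian matching would not allow. NMS then drops the lower-scoring of the two at inference time.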

Results

Task	Dataset	Metric	Value	Model
Object Detection	COCO test-dev	AP50	80.4	DETA (Swin-L)
Object Detection	COCO test-dev	AP75	70.2	DETA (Swin-L)
Object Detection	COCO test-dev	APL	76.9	DETA (Swin-L)
Object Detection	COCO test-dev	APM	66.9	DETA (Swin-L)
Object Detection	COCO test-dev	APS	46.1	DETA (Swin-L)
Object Detection	COCO test-dev	box mAP	63.5	DETA (Swin-L)
Object Detection	COCO-O	Average mAP	48.5	DETA (Swin-L)
Object Detection	COCO-O	Effective Robustness	20.15	DETA (Swin-L)
