DETRs with Collaborative Hybrid Assignments Training

Zhuofan Zong, Guanglu Song, Yu Liu

2022-11-22 | ICCV 2023 | Instance Segmentation, Object Detection

Abstract

In this paper, we observe that in DETR with one-to-one set matching, too few queries are assigned as positive samples. This leads to sparse supervision on the encoder's output, which considerably hurts the discriminative feature learning of the encoder and, likewise, the attention learning in the decoder. To alleviate this, we present a novel collaborative hybrid assignments training scheme, namely $\mathcal{C}$o-DETR, to learn more efficient and effective DETR-based detectors from versatile label assignment manners. This new training scheme can easily enhance the encoder's learning ability in end-to-end detectors by training multiple parallel auxiliary heads supervised by one-to-many label assignments such as ATSS and Faster RCNN. In addition, we construct extra customized positive queries by extracting the positive coordinates from these auxiliary heads to improve the training efficiency of positive samples in the decoder. At inference, these auxiliary heads are discarded, so our method introduces no additional parameters or computational cost to the original detector and requires no hand-crafted non-maximum suppression (NMS). We conduct extensive experiments to evaluate the effectiveness of the proposed approach on DETR variants, including DAB-DETR, Deformable-DETR, and DINO-Deformable-DETR. The state-of-the-art DINO-Deformable-DETR with Swin-L can be improved from 58.5% to 59.5% AP on COCO val. Surprisingly, with a ViT-L backbone, we achieve 66.0% AP on COCO test-dev and 67.9% AP on LVIS val, outperforming previous methods by clear margins with much smaller model sizes. Code is available at https://github.com/Sense-X/Co-DETR.
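The contrast between one-to-one and one-to-many label assignment described in the abstract can be illustrated with a toy sketch. This is not the Co-DETR implementation: the function names are hypothetical, the one-to-one matcher is a brute-force search that maximizes total IoU (DETR-style detectors use Hungarian matching on a cost that also includes classification and box regression terms), and the fixed IoU threshold of 0.5 is an assumed stand-in for assigners such as ATSS or Faster RCNN.

```python
from itertools import permutations


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def one_to_one_positives(candidates, gts):
    """Toy set matching: each ground-truth box gets exactly one candidate.

    Brute force over assignments, maximizing the summed IoU (assumption:
    a simplified cost, only practical for a handful of boxes).
    """
    best, best_score = (), -1.0
    for perm in permutations(range(len(candidates)), len(gts)):
        score = sum(iou(candidates[c], gts[g]) for g, c in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    return set(best)


def one_to_many_positives(candidates, gts, thr=0.5):
    """Toy one-to-many rule: any candidate with IoU >= thr to a GT is positive."""
    return {
        c for c, box in enumerate(candidates)
        if any(iou(box, g) >= thr for g in gts)
    }


# Two ground-truth boxes and seven candidate boxes (made-up coordinates).
gts = [(10, 10, 50, 50), (60, 60, 100, 100)]
candidates = [
    (10, 10, 50, 50), (12, 12, 52, 52), (8, 8, 48, 48),   # near GT 0
    (60, 60, 100, 100), (62, 58, 102, 98),                # near GT 1
    (0, 0, 20, 20), (200, 200, 240, 240),                 # background
]

o2o = one_to_one_positives(candidates, gts)
o2m = one_to_many_positives(candidates, gts)
print(len(o2o), "positives with one-to-one matching")
print(len(o2m), "positives with one-to-many assignment")
```

In this toy setup the one-to-one matcher can never yield more positives than there are ground-truth boxes, while the one-to-many rule also marks the nearby candidates as positive. That denser set of positives is the kind of extra supervision the auxiliary heads are meant to provide during training.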

Results

Task	Dataset	Metric	Value	Model
Object Detection	LVIS v1.0 minival	box AP	72	Co-DETR (single-scale)
Object Detection	COCO test-dev	Params (M)	304	Co-DETR
Object Detection	COCO test-dev	box mAP	66	Co-DETR
Object Detection	COCO test-dev	Params (M)	218	Co-DETR (Swin-L)
Object Detection	COCO test-dev	box mAP	64.8	Co-DETR (Swin-L)
Object Detection	COCO minival	Params (M)	314	Co-DETR
Object Detection	COCO minival	box AP	65.9	Co-DETR
Object Detection	COCO minival	Params (M)	218	Co-DETR (Swin-L)
Object Detection	COCO minival	box AP	64.7	Co-DETR (Swin-L)
Object Detection	LVIS v1.0 val	box AP	68	Co-DETR (single-scale)
Instance Segmentation	COCO minival	AP50	79.7	Co-DETR
Instance Segmentation	COCO minival	AP75	62.8	Co-DETR
Instance Segmentation	COCO minival	APL	74.6	Co-DETR
Instance Segmentation	COCO minival	APM	59.7	Co-DETR
Instance Segmentation	COCO minival	APS	38.9	Co-DETR
Instance Segmentation	COCO minival	mask AP	56.6	Co-DETR
Instance Segmentation	COCO test-dev	AP50	80.2	Co-DETR
Instance Segmentation	COCO test-dev	AP75	63.4	Co-DETR
Instance Segmentation	COCO test-dev	APL	72	Co-DETR
Instance Segmentation	COCO test-dev	APM	60.1	Co-DETR
Instance Segmentation	COCO test-dev	APS	41.6	Co-DETR
Instance Segmentation	COCO test-dev	mask AP	57.1	Co-DETR
Instance Segmentation	LVIS v1.0 val	mask AP	60.7	Co-DETR (single-scale)
