3D on LVIS v1.0 val

Metric: box AP (higher is better)

Results

#	Model	box AP	Augmentations	SOTA	Paper	Date	Code
1	Co-DETR (single-scale)	68	Yes	Yes	DETRs with Collaborative Hybrid Assignments Training	2022-11-22	Code
2	Grounding DINO 1.5 Pro	63.5	Yes	No	Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection	2024-05-16	Code
3	InternImage-H	63.2	Yes	Yes	InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions	2022-11-10	Code
4	EVA	62.2	Yes	No	EVA: Exploring the Limits of Masked Visual Representation Learning at Scale	2022-11-14	Code
5	RichSem (Focal-H + ImageNet as weakly-supervised extra data)	61.2	Yes	No	Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection	2023-10-18	Code
6	GLEE-Pro	55.7	Yes	No	General Object Foundation Model for Images and Videos at Scale	2023-12-14	Code
7	ViTDet-H	53.4	No	Yes	Exploring Plain Vision Transformer Backbones for Object Detection	2022-03-30	Code
8	SimLTD w/MixPL (Swin-L + COCO unlabeled images)	51.5	Yes	No	SimLTD: Simple Supervised and Semi-Supervised Long-Tailed Object Detection	2024-12-28	Code
9	DiverGen (Swin-L)	51.2	No	No	DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data	2024-05-16	Code
10	ViTDet-L	51.2	No	No	Exploring Plain Vision Transformer Backbones for Object Detection	2022-03-30	Code
11	CenterNet2 (Swin-L w/ X-Paste + Copy-Paste)	50.9	No	No	X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion	2022-12-07	Code
12	SimLTD Fully Supervised (Swin-L)	49.8	No	No	SimLTD: Simple Supervised and Semi-Supervised Long-Tailed Object Detection	2024-12-28	Code
13	Eff-B7 NAS-FPN (1280, Copy-Paste pre-training)	41.6	No	Yes	Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation	2020-12-13	Code
14	R101-MaskRCNN-LOCE	29	No	No	Exploring Classification Equilibrium in Long-Tailed Object Detection	2021-08-17	Code
15	R50-MaskRCNN-LOCE	27.4	No	No	Exploring Classification Equilibrium in Long-Tailed Object Detection	2021-08-17	Code
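
The page does not define what its SOTA marker means. One plausible reading, offered here only as an assumption, is that an entry is marked when its box AP exceeds that of every entry with an earlier paper date. The sketch below checks that reading against the 15 results listed above. The function name `running_best` and the tie rule (on a shared date, only the highest-scoring entry can qualify) are illustrative choices, not anything the leaderboard documents.

```python
from datetime import date

# (model, box AP, paper date) copied from the results listed above.
ROWS = [
    ("Co-DETR (single-scale)", 68.0, date(2022, 11, 22)),
    ("Grounding DINO 1.5 Pro", 63.5, date(2024, 5, 16)),
    ("InternImage-H", 63.2, date(2022, 11, 10)),
    ("EVA", 62.2, date(2022, 11, 14)),
    ("RichSem (Focal-H + ImageNet as weakly-supervised extra data)", 61.2, date(2023, 10, 18)),
    ("GLEE-Pro", 55.7, date(2023, 12, 14)),
    ("ViTDet-H", 53.4, date(2022, 3, 30)),
    ("SimLTD w/MixPL (Swin-L + COCO unlabeled images)", 51.5, date(2024, 12, 28)),
    ("DiverGen (Swin-L)", 51.2, date(2024, 5, 16)),
    ("ViTDet-L", 51.2, date(2022, 3, 30)),
    ("CenterNet2 (Swin-L w/ X-Paste + Copy-Paste)", 50.9, date(2022, 12, 7)),
    ("SimLTD Fully Supervised (Swin-L)", 49.8, date(2024, 12, 28)),
    ("Eff-B7 NAS-FPN (1280, Copy-Paste pre-training)", 41.6, date(2020, 12, 13)),
    ("R101-MaskRCNN-LOCE", 29.0, date(2021, 8, 17)),
    ("R50-MaskRCNN-LOCE", 27.4, date(2021, 8, 17)),
]


def running_best(rows):
    """Return models whose box AP beat every earlier-dated entry (assumed SOTA rule)."""
    best = float("-inf")
    winners = []
    # Walk entries chronologically; on a shared date, visit the highest AP first
    # so that only the top entry of that date can set a new best.
    for model, ap, _day in sorted(rows, key=lambda r: (r[2], -r[1])):
        if ap > best:
            best = ap
            winners.append(model)
    return winners


if __name__ == "__main__":
    for name in running_best(ROWS):
        print(name)
```

Under this assumed rule the function returns the same four entries that carry the SOTA marker on this page, in chronological order. That agreement is consistent with the reading but does not confirm it.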