Metric: METEOR (higher is better)
| # | Model | METEOR | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | 3D CoCa | 30.95 | No | 3D CoCa: Contrastive Learners are 3D Captioners | 2025-04-13 | Code |
| 2 | Vote2Cap-DETR++ | 28.7 | No | Vote2Cap-DETR++: Decoupling Localization and Des... | 2023-09-06 | Code |
| 3 | Vote2Cap-DETR | 28.25 | No | End-to-End 3D Dense Captioning with Vote2Cap-DETR | 2023-01-06 | Code |
| 4 | See It All | 27.92 | No | See It All: Contextualized Late Aggregation for ... | 2024-08-14 | - |
| 5 | BiCA | 27.76 | No | Bi-directional Contextual Attention for 3D Dense... | 2024-08-13 | - |
| 6 | 3DJCG | 27.45 | No | - | - | - |
| 7 | MORE | 26.36 | No | MORE: Multi-Order RElation Mining for Dense Capt... | 2022-03-10 | Code |
| 8 | SpaCap3D | 26.16 | No | Spatiality-guided Transformer for 3D Dense Capti... | 2022-04-22 | Code |
| 9 | Scan2Cap | 26.14 | No | Scan2Cap: Context-aware Dense Captioning in RGB-... | 2020-12-03 | - |
| 10 | 3D-VLP | 24.53 | No | - | - | Code |
| 11 | Contextual | 22.57 | No | Contextual Modeling for 3D Dense Captioning on P... | 2022-10-08 | - |
| 12 | X-Trans2Cap | 21.9 | No | X-Trans2Cap: Cross-Modal Knowledge Transfer usin... | 2022-03-02 | Code |