Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations

Adel Ahmadyan, Liangkai Zhang, Jianing Wei, Artsiom Ablavatski, Matthias Grundmann

Date: 2020-12-18
Venue: CVPR 2021
Tasks: Monocular 3D Object Detection, 3D Shape Representation, 3D Object Tracking, Object Tracking, Retrieval, object-detection, 3D Object Detection, Object Detection, Image Retrieval

Abstract

3D object detection has recently become popular due to many applications in robotics, augmented reality, autonomy, and image retrieval. We introduce the Objectron dataset to advance the state of the art in 3D object detection and foster new research and applications, such as 3D object tracking, view synthesis, and improved 3D shape representation. The dataset contains object-centric short videos with pose annotations for nine categories and includes 4 million annotated images in 14,819 annotated videos. We also propose a new evaluation metric, 3D Intersection over Union, for 3D object detection. We demonstrate the usefulness of our dataset in 3D object detection tasks by providing baseline models trained on this dataset. Our dataset and evaluation source code are available online at http://www.objectron.dev
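
As a rough illustration of the 3D Intersection over Union idea named in the abstract, the sketch below computes IoU for two axis-aligned 3D boxes. This is a simplification and an assumption on our part: pose-annotated boxes are generally oriented, which this sketch does not handle, and it is not the dataset's evaluation code. The function name `aabb_iou_3d` and the `(xmin, ymin, zmin, xmax, ymax, zmax)` box layout are hypothetical.

```python
from typing import Sequence


def aabb_iou_3d(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """IoU of two axis-aligned 3D boxes.

    Each box is (xmin, ymin, zmin, xmax, ymax, zmax). Axis alignment is an
    assumption of this sketch; oriented boxes need a polyhedron intersection.
    """
    ax0, ay0, az0, ax1, ay1, az1 = box_a
    bx0, by0, bz0, bx1, by1, bz1 = box_b

    # Overlap length along each axis, clamped at zero when the boxes are apart.
    dx = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    dy = max(0.0, min(ay1, by1) - max(ay0, by0))
    dz = max(0.0, min(az1, bz1) - max(az0, bz0))
    intersection = dx * dy * dz

    volume_a = (ax1 - ax0) * (ay1 - ay0) * (az1 - az0)
    volume_b = (bx1 - bx0) * (by1 - by0) * (bz1 - bz0)
    union = volume_a + volume_b - intersection

    # Degenerate (zero-volume) inputs yield an IoU of 0 rather than an error.
    return intersection / union if union > 0 else 0.0


unit_cube = (0.0, 0.0, 0.0, 1.0, 1.0, 1.0)
shifted_cube = (0.5, 0.0, 0.0, 1.5, 1.0, 1.0)
print(aabb_iou_3d(unit_cube, shifted_cube))
```

Under this simplified definition, a detection would count as correct at the 0.5 threshold when its IoU with the ground-truth box is at least 0.5.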

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Detection | Google Objectron | AP at 10° Elevation error | 0.8584 | EfficientNetLite + keypoint regressor |
| Object Detection | Google Objectron | AP at 15° Azimuth error | 0.7844 | EfficientNetLite + keypoint regressor |
| Object Detection | Google Objectron | Average Precision at 0.5 3D IoU | 0.6512 | EfficientNetLite + keypoint regressor |
| Object Detection | Google Objectron | MPE | 0.0467 | EfficientNetLite + keypoint regressor |
| 3D | Google Objectron | AP at 10° Elevation error | 0.8584 | EfficientNetLite + keypoint regressor |
| 3D | Google Objectron | AP at 15° Azimuth error | 0.7844 | EfficientNetLite + keypoint regressor |
| 3D | Google Objectron | Average Precision at 0.5 3D IoU | 0.6512 | EfficientNetLite + keypoint regressor |
| 3D | Google Objectron | MPE | 0.0467 | EfficientNetLite + keypoint regressor |
| 3D Object Detection | Google Objectron | AP at 10° Elevation error | 0.8584 | EfficientNetLite + keypoint regressor |
| 3D Object Detection | Google Objectron | AP at 15° Azimuth error | 0.7844 | EfficientNetLite + keypoint regressor |
| 3D Object Detection | Google Objectron | Average Precision at 0.5 3D IoU | 0.6512 | EfficientNetLite + keypoint regressor |
| 3D Object Detection | Google Objectron | MPE | 0.0467 | EfficientNetLite + keypoint regressor |
| 2D Classification | Google Objectron | AP at 10° Elevation error | 0.8584 | EfficientNetLite + keypoint regressor |
| 2D Classification | Google Objectron | AP at 15° Azimuth error | 0.7844 | EfficientNetLite + keypoint regressor |
| 2D Classification | Google Objectron | Average Precision at 0.5 3D IoU | 0.6512 | EfficientNetLite + keypoint regressor |
| 2D Classification | Google Objectron | MPE | 0.0467 | EfficientNetLite + keypoint regressor |
| 2D Object Detection | Google Objectron | AP at 10° Elevation error | 0.8584 | EfficientNetLite + keypoint regressor |
| 2D Object Detection | Google Objectron | AP at 15° Azimuth error | 0.7844 | EfficientNetLite + keypoint regressor |
| 2D Object Detection | Google Objectron | Average Precision at 0.5 3D IoU | 0.6512 | EfficientNetLite + keypoint regressor |
| 2D Object Detection | Google Objectron | MPE | 0.0467 | EfficientNetLite + keypoint regressor |
| 16k | Google Objectron | AP at 10° Elevation error | 0.8584 | EfficientNetLite + keypoint regressor |
| 16k | Google Objectron | AP at 15° Azimuth error | 0.7844 | EfficientNetLite + keypoint regressor |
| 16k | Google Objectron | Average Precision at 0.5 3D IoU | 0.6512 | EfficientNetLite + keypoint regressor |
| 16k | Google Objectron | MPE | 0.0467 | EfficientNetLite + keypoint regressor |
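
The results above include an average precision figure at a 0.5 3D IoU threshold. The sketch below shows one common way such a number can be computed once each detection has been marked as a true or false positive at the chosen threshold. The non-interpolated area-under-curve formula, the function name `average_precision`, and the input layout are assumptions made for illustration; the exact protocol behind the reported values may differ.

```python
from typing import Sequence


def average_precision(
    scores: Sequence[float], is_true_positive: Sequence[bool], num_ground_truth: int
) -> float:
    """Non-interpolated average precision over ranked detections.

    scores: confidence of each detection.
    is_true_positive: whether each detection matched a ground-truth box
        (for example, 3D IoU >= 0.5); the matching itself is done elsewhere.
    num_ground_truth: total number of ground-truth boxes.
    """
    if num_ground_truth <= 0:
        return 0.0

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_positives = 0
    false_positives = 0
    previous_recall = 0.0
    ap = 0.0
    for i in order:
        if is_true_positive[i]:
            true_positives += 1
        else:
            false_positives += 1
        precision = true_positives / (true_positives + false_positives)
        recall = true_positives / num_ground_truth
        # Accumulate precision weighted by the recall gained at this rank.
        ap += precision * (recall - previous_recall)
        previous_recall = recall
    return ap


print(average_precision([0.9, 0.8, 0.7], [True, False, True], num_ground_truth=2))
```

False positives add nothing directly to the sum, but they lower the precision applied to every later true positive, which is why ranking quality matters as much as the raw hit count.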
