Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations

Adel Ahmadyan, Liangkai Zhang, Jianing Wei, Artsiom Ablavatski, Matthias Grundmann

2020-12-18, CVPR 2021

Tasks: Monocular 3D Object Detection, 3D Shape Representation, 3D Object Tracking, Object Tracking, Retrieval, object-detection, 3D Object Detection, Object Detection, Image Retrieval


Abstract

3D object detection has recently become popular due to many applications in robotics, augmented reality, autonomy, and image retrieval. We introduce the Objectron dataset to advance the state of the art in 3D object detection and foster new research and applications, such as 3D object tracking, view synthesis, and improved 3D shape representation. The dataset contains object-centric short videos with pose annotations for nine categories and includes 4 million annotated images in 14,819 annotated videos. We also propose a new evaluation metric, 3D Intersection over Union, for 3D object detection. We demonstrate the usefulness of our dataset in 3D object detection tasks by providing baseline models trained on this dataset. Our dataset and evaluation source code are available online at http://www.objectron.dev
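To make the 3D Intersection over Union metric named in the abstract concrete, here is a minimal sketch of the ratio it measures: the volume shared by two boxes divided by the volume of their union. It assumes axis-aligned boxes, which is a simplification; the dataset's annotated boxes are oriented in general, and this sketch does not handle rotation. The names `AxisAlignedBox` and `iou_3d` are hypothetical and are not taken from the Objectron evaluation code.

```python
from dataclasses import dataclass
from typing import Tuple

Point3 = Tuple[float, float, float]


@dataclass(frozen=True)
class AxisAlignedBox:
    """Hypothetical helper: a box described by its min and max corners."""
    min_corner: Point3
    max_corner: Point3

    def volume(self) -> float:
        # Product of the extents along x, y, and z (clamped at zero).
        v = 1.0
        for lo, hi in zip(self.min_corner, self.max_corner):
            v *= max(0.0, hi - lo)
        return v


def iou_3d(a: AxisAlignedBox, b: AxisAlignedBox) -> float:
    """Intersection volume divided by union volume, for axis-aligned boxes only."""
    intersection = 1.0
    for a_lo, a_hi, b_lo, b_hi in zip(
        a.min_corner, a.max_corner, b.min_corner, b.max_corner
    ):
        # Overlap of the two intervals along this axis.
        overlap = min(a_hi, b_hi) - max(a_lo, b_lo)
        if overlap <= 0.0:
            return 0.0  # Boxes are separated along this axis.
        intersection *= overlap
    union = a.volume() + b.volume() - intersection
    return intersection / union if union > 0.0 else 0.0


if __name__ == "__main__":
    unit = AxisAlignedBox((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
    shifted = AxisAlignedBox((0.5, 0.0, 0.0), (1.5, 1.0, 1.0))
    # Half of each unit cube overlaps: 0.5 / (1 + 1 - 0.5).
    print(iou_3d(unit, shifted))
```

Under a metric such as "Average Precision at 0.5 3D IoU", a predicted box would count as a match only when this ratio against the ground-truth box reaches 0.5; the pair in the usage example above would fall short of that threshold.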

Results

Task	Dataset	Metric	Value	Model
3D Object Detection	Google Objectron	AP at 10° elevation error	0.8584	EfficientNetLite + keypoint regressor
3D Object Detection	Google Objectron	AP at 15° azimuth error	0.7844	EfficientNetLite + keypoint regressor
3D Object Detection	Google Objectron	Average Precision at 0.5 3D IoU	0.6512	EfficientNetLite + keypoint regressor
3D Object Detection	Google Objectron	MPE	0.0467	EfficientNetLite + keypoint regressor

The same four results are also listed under the task labels Object Detection, 3D, 2D Classification, 2D Object Detection, and 16k.
