Objects do not disappear: Video object detection by single-frame object location anticipation

Xin Liu, Fatemeh Karimi Nejadasl, Jan C. van Gemert, Olaf Booij, Silvia L. Pintea

Published 2023-08-09, ICCV 2023. Tasks: Video Object Detection, Object Detection.

Abstract

Objects in videos are typically characterized by continuous smooth motion. We exploit continuous smooth motion in three ways. 1) Improved accuracy by using object motion as an additional source of supervision, which we obtain by anticipating object locations from a static keyframe. 2) Improved efficiency by only doing the expensive feature computations on a small subset of all frames. Because neighboring video frames are often redundant, we only compute features for a single static keyframe and predict object locations in subsequent frames. 3) Reduced annotation cost, where we only annotate the keyframe and use smooth pseudo-motion between keyframes. We demonstrate computational efficiency, annotation efficiency, and improved mean average precision compared to the state-of-the-art on four datasets: ImageNet VID, EPIC KITCHENS-55, YouTube-BoundingBoxes, and Waymo Open dataset. Our source code is available at https://github.com/L-KID/Videoobject-detection-by-location-anticipation.
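The third point of the abstract, annotating only keyframes and filling the gap with smooth pseudo-motion, can be illustrated with a small sketch. The abstract does not say how the pseudo-motion is generated, so linear interpolation of box corners is an assumption made here for illustration only; the function name `interpolate_boxes` and the `(x1, y1, x2, y2)` box layout are likewise hypothetical and not taken from the authors' code.

```python
from typing import List, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def interpolate_boxes(box_start: Box, box_end: Box, num_between: int) -> List[Box]:
    """Return pseudo-annotations for the frames strictly between two keyframes.

    Assumption: motion between keyframes is modelled as linear interpolation
    of the box corners. The paper may use a different smooth motion model.
    """
    steps = num_between + 1
    boxes: List[Box] = []
    for i in range(1, steps):
        t = i / steps  # fraction of the way from the first to the second keyframe
        boxes.append(tuple((1 - t) * a + t * b for a, b in zip(box_start, box_end)))
    return boxes


# Two annotated keyframes with three unannotated frames in between.
pseudo = interpolate_boxes((0.0, 0.0, 10.0, 10.0), (8.0, 4.0, 18.0, 14.0), num_between=3)
for box in pseudo:
    print(box)
```

Under this assumption only the two keyframes need a human annotation, and the three intermediate boxes are derived from them.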

Results

| Dataset            | Metric | Value | Model                     |
|--------------------|--------|-------|---------------------------|
| EPIC-KITCHENS-55   | mAP@.5 | 41.7  | Ours (Faster RCNN)        |
| Waymo Open Dataset | AP     | 59.28 | (not listed)              |
| ImageNet VID       | mAP    | 91.3  | Ours (Def. DETR + SwinB)  |
| ImageNet VID       | mAP    | 87.9  | Ours (Def. DETR + R101)   |
| ImageNet VID       | mAP    | 87.2  | Ours (Faster RCNN + R101) |
| YT-BB              | mAP    | 59.8  | (not listed)              |

The same six results are listed under each of the following task labels: Object Detection, 3D, 2D Classification, 2D Object Detection, and 16k.
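The mAP@.5 metric reported for EPIC-KITCHENS-55 counts a predicted box as a match when its intersection over union (IoU) with a ground-truth box of the same class reaches 0.5. The sketch below shows only that matching test, not a full mAP computation. The helper names `iou` and `is_match` are hypothetical, and whether the comparison is strict or non-strict depends on the benchmark's own evaluation code.

```python
from typing import Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """True when the prediction overlaps the ground truth enough to count at this threshold."""
    return iou(pred, gt) >= threshold


print(is_match((0, 0, 10, 10), (0, 0, 10, 5)))    # overlap covers half of the union
print(is_match((0, 0, 10, 10), (5, 5, 15, 15)))   # small corner overlap
```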
