Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Integrated Object Detection and Tracking with Tracklet-Conditioned Detection

Zheng Zhang, Dazhi Cheng, Xizhou Zhu, Stephen Lin, Jifeng Dai

2018-11-27 | Video Object Detection, Video Understanding, object-detection, Object Detection

Abstract

Accurate detection and tracking of objects is vital for effective video understanding. In previous work, the two tasks have been combined in a way that tracking is based heavily on detection, but the detection benefits marginally from the tracking. To increase synergy, we propose to more tightly integrate the tasks by conditioning the object detection in the current frame on tracklets computed in prior frames. With this approach, the object detection results not only have high detection responses, but also improved coherence with the existing tracklets. This greater coherence leads to estimated object trajectories that are smoother and more stable than the jittered paths obtained without tracklet-conditioned detection. Over extensive experiments, this approach is shown to achieve state-of-the-art performance in terms of both detection and tracking accuracy, as well as noticeable improvements in tracking stability.
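
To make the idea of conditioning detection on tracklets concrete, here is a toy Python sketch. It is not the paper's formulation: the abstract does not specify how the conditioning is computed, so this example assumes a simple stand-in in which each detection score is blended with its best IoU overlap against the most recent box of any existing tracklet. The names `iou`, `rescore_with_tracklets`, and the `weight` parameter are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def rescore_with_tracklets(
    detections: List[Tuple[Box, float]],
    tracklets: List[List[Box]],
    weight: float = 0.5,  # hypothetical blending weight, not from the paper
) -> List[Tuple[Box, float]]:
    """Blend each detection score with its coherence to prior tracklets.

    Coherence is assumed here to be the best IoU between the detection
    and the last box of any non-empty tracklet (0.0 if there are none).
    """
    rescored = []
    for box, score in detections:
        coherence = max(
            (iou(box, track[-1]) for track in tracklets if track),
            default=0.0,
        )
        rescored.append((box, (1.0 - weight) * score + weight * coherence))
    return rescored


if __name__ == "__main__":
    detections = [((0, 0, 10, 10), 0.6), ((50, 50, 60, 60), 0.7)]
    tracklets = [[(0, 0, 10, 10)]]
    for box, score in rescore_with_tracklets(detections, tracklets):
        print(box, round(score, 3))
```

In this sketch, a detection that overlaps an existing tracklet can outrank a detection with a higher raw score but no tracklet support, which loosely mirrors the abstract's point that results should combine high detection responses with coherence to existing tracklets.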

Results

Task                | Dataset      | Metric | Value | Model
Object Detection    | ImageNet VID | MAP    | 83.5  | Tracklet-Conditioned Detection+DCNv2+FGFA
3D                  | ImageNet VID | MAP    | 83.5  | Tracklet-Conditioned Detection+DCNv2+FGFA
2D Classification   | ImageNet VID | MAP    | 83.5  | Tracklet-Conditioned Detection+DCNv2+FGFA
2D Object Detection | ImageNet VID | MAP    | 83.5  | Tracklet-Conditioned Detection+DCNv2+FGFA
16k                 | ImageNet VID | MAP    | 83.5  | Tracklet-Conditioned Detection+DCNv2+FGFA

Related Papers

VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks (2025-07-15)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)