Siam R-CNN: Visual Tracking by Re-Detection

Paul Voigtlaender, Jonathon Luiten, Philip H. S. Torr, Bastian Leibe

Date: 2019-11-28. Venue: CVPR 2020.
Tasks: Visual Object Tracking, Semi-Supervised Video Object Segmentation, Visual Tracking, Object Tracking, Object Detection

Abstract

We present Siam R-CNN, a Siamese re-detection architecture which unleashes the full power of two-stage object detection approaches for visual object tracking. We combine this with a novel tracklet-based dynamic programming algorithm, which takes advantage of re-detections of both the first-frame template and previous-frame predictions, to model the full history of both the object to be tracked and potential distractor objects. This enables our approach to make better tracking decisions, as well as to re-detect tracked objects after long occlusion. Finally, we propose a novel hard example mining strategy to improve Siam R-CNN's robustness to similar looking objects. Siam R-CNN achieves the current best performance on ten tracking benchmarks, with especially strong results for long-term tracking. We make our code and models available at www.vision.rwth-aachen.de/page/siamrcnn.
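The abstract only names the tracklet-based dynamic programming step, so the following is a generic sketch of the idea rather than the authors' implementation: choose a chain of temporally non-overlapping tracklets that maximizes a total score. The `Tracklet` fields, the unary `score`, the L1 box `jump_penalty`, and the `weight` value are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed format


@dataclass(frozen=True)
class Tracklet:
    start: int        # first frame index
    end: int          # last frame index (inclusive)
    score: float      # hypothetical unary score (e.g. similarity to the template)
    first_box: Box    # box in the first frame of the tracklet
    last_box: Box     # box in the last frame of the tracklet


def jump_penalty(prev: Tracklet, nxt: Tracklet, weight: float) -> float:
    """Hypothetical consistency term: L1 distance between the boxes at the gap."""
    return weight * sum(abs(a - b) for a, b in zip(prev.last_box, nxt.first_box))


def best_chain(tracklets: List[Tracklet], weight: float = 0.01):
    """Return (chain, total) for the highest-scoring non-overlapping chain."""
    order = sorted(tracklets, key=lambda t: t.start)
    # best[i] = (score of the best chain ending at order[i], predecessor index)
    best: List[Tuple[float, Optional[int]]] = []
    for i, cur in enumerate(order):
        top, prev_idx = cur.score, None
        for j in range(i):
            prev = order[j]
            if prev.end < cur.start:  # tracklets must not overlap in time
                cand = best[j][0] + cur.score - jump_penalty(prev, cur, weight)
                if cand > top:
                    top, prev_idx = cand, j
        best.append((top, prev_idx))
    if not best:
        return [], 0.0
    idx: Optional[int] = max(range(len(best)), key=lambda k: best[k][0])
    total = best[idx][0]
    chain = []
    while idx is not None:  # follow predecessor links back to the chain start
        chain.append(order[idx])
        idx = best[idx][1]
    return chain[::-1], total
```

In this sketch a distractor with a high unary score can still lose to a lower-scoring tracklet that is spatially consistent with the previous one, which is the kind of decision the abstract attributes to modelling the history of both the target and its distractors.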

Results

Tasks: Video, Video Object Segmentation, Semi-Supervised Video Object Segmentation (identical results are listed under each task)

| Dataset | Metric | Value | Model |
|---|---|---|---|
| DAVIS 2017 (val) | F-measure (Decay) | 16.2 | Siam R-CNN |
| DAVIS 2017 (val) | F-measure (Mean) | 75 | Siam R-CNN |
| DAVIS 2017 (val) | F-measure (Recall) | 82.8 | Siam R-CNN |
| DAVIS 2017 (val) | J&F | 70.55 | Siam R-CNN |
| DAVIS 2017 (val) | Jaccard (Decay) | 15.8 | Siam R-CNN |
| DAVIS 2017 (val) | Jaccard (Mean) | 66.1 | Siam R-CNN |
| DAVIS 2017 (val) | Jaccard (Recall) | 74.8 | Siam R-CNN |
| DAVIS 2016 | F-measure (Decay) | 4 | Siam R-CNN |
| DAVIS 2016 | F-measure (Mean) | 80.4 | Siam R-CNN |
| DAVIS 2016 | F-measure (Recall) | 87.6 | Siam R-CNN |
| DAVIS 2016 | J&F | 78.6 | Siam R-CNN |
| DAVIS 2016 | Jaccard (Decay) | 2.2 | Siam R-CNN |
| DAVIS 2016 | Jaccard (Mean) | 76.8 | Siam R-CNN |
| DAVIS 2016 | Jaccard (Recall) | 86.4 | Siam R-CNN |
| DAVIS 2017 (test-dev) | F-measure (Decay) | 20.2 | Siam R-CNN |
| DAVIS 2017 (test-dev) | F-measure (Mean) | 58.6 | Siam R-CNN |
| DAVIS 2017 (test-dev) | F-measure (Recall) | 62.3 | Siam R-CNN |
| DAVIS 2017 (test-dev) | J&F | 53.3 | Siam R-CNN |
| DAVIS 2017 (test-dev) | Jaccard (Decay) | 21.8 | Siam R-CNN |
| DAVIS 2017 (test-dev) | Jaccard (Mean) | 48 | Siam R-CNN |
| DAVIS 2017 (test-dev) | Jaccard (Recall) | 53.9 | Siam R-CNN |

Tasks: Object Tracking, Visual Object Tracking (COESOT is listed under Object Tracking only)

| Dataset | Metric | Value | Model |
|---|---|---|---|
| COESOT | Precision Rate | 67.5 | SiamR-CNN |
| COESOT | Success Rate | 60.9 | SiamR-CNN |
| LaSOT | AUC | 64.8 | Siam R-CNN |
| LaSOT | Normalized Precision | 72.2 | Siam R-CNN |
| GOT-10k | Average Overlap | 64.9 | Siam R-CNN |
| GOT-10k | Success Rate 0.5 | 72.8 | Siam R-CNN |
| TrackingNet | Accuracy | 81.2 | Siam R-CNN |
| TrackingNet | Normalized Precision | 85.4 | Siam R-CNN |
| TrackingNet | Precision | 80 | Siam R-CNN |
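The J&F rows in the DAVIS results are consistent with the usual DAVIS convention that J&F is the average of Jaccard (Mean) and F-measure (Mean). A small check of that relationship against the values listed above follows; the helper name `j_and_f` and the dictionary layout are illustrative choices, not part of any benchmark toolkit.

```python
def j_and_f(j_mean: float, f_mean: float) -> float:
    """Average of the region score (J mean) and the boundary score (F mean)."""
    return (j_mean + f_mean) / 2


# (Jaccard mean, F-measure mean, reported J&F), copied from the results above
reported = {
    "DAVIS 2017 (val)": (66.1, 75.0, 70.55),
    "DAVIS 2016": (76.8, 80.4, 78.6),
    "DAVIS 2017 (test-dev)": (48.0, 58.6, 53.3),
}

for dataset, (j_mean, f_mean, jf) in reported.items():
    computed = j_and_f(j_mean, f_mean)
    # Allow for floating-point rounding when comparing to the listed value
    status = "matches" if abs(computed - jf) < 1e-6 else "differs from"
    print(f"{dataset}: computed {computed:.2f} {status} reported {jf}")
```

All three datasets reproduce the listed J&F value from the two means, so the J&F rows carry no information beyond the Jaccard (Mean) and F-measure (Mean) rows.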
