Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Towards Sequence-Level Training for Visual Tracking

Minji Kim, Seungkwan Lee, Jungseul Ok, Bohyung Han, Minsu Cho

2022-08-11 · Visual Object Tracking · Reinforcement Learning · Data Augmentation · Object Tracking · Video Object Tracking

Paper · PDF · Code (official) · Code

Abstract

Despite the extensive adoption of machine learning for visual object tracking, recent learning-based approaches have largely overlooked the fact that visual tracking is by nature a sequence-level task; they rely heavily on frame-level training, which inevitably induces inconsistency between training and testing in both data distributions and task objectives. This work introduces a sequence-level training strategy for visual tracking based on reinforcement learning and discusses how a sequence-level design of data sampling, learning objectives, and data augmentation can improve the accuracy and robustness of tracking algorithms. Our experiments on standard benchmarks including LaSOT, TrackingNet, and GOT-10k demonstrate that four representative tracking models, SiamRPN++, SiamAttn, TransT, and TrDiMP, consistently improve when the proposed methods are incorporated into training, without modifying their architectures.
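The abstract describes rewarding a tracker on whole sequences rather than on individual frames. As a rough illustration only, not the paper's actual implementation, a sequence-level reward can be defined as the mean IoU between predicted and ground-truth boxes over a clip, with a REINFORCE-style advantage computed against a batch-mean baseline. All function names and the (x, y, w, h) box format below are hypothetical:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes, each given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix1, iy1 = max(ax, bx), max(ay, by)
    ix2, iy2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def sequence_reward(pred_boxes, gt_boxes):
    """Sequence-level reward: mean IoU over every frame of the clip."""
    ious = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    return sum(ious) / len(ious)

def reinforce_weights(seq_rewards, baseline=None):
    """REINFORCE-style advantages: reward minus a baseline.

    With no explicit baseline, the batch-mean reward is used, so
    above-average sequences get positive weight and below-average
    ones negative weight on their log-probability terms.
    """
    if baseline is None:
        baseline = sum(seq_rewards) / len(seq_rewards)
    return [r - baseline for r in seq_rewards]
```

In an actual training loop these advantages would multiply the log-probabilities of the tracker's sampled predictions; here they only illustrate why a sequence-level objective matches the test-time metric (per-sequence overlap) better than a per-frame loss does.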

Results

Task | Dataset | Metric | Value | Model
Visual Object Tracking | NT-VOT211 | AUC | 37.22 | SLT-TransT
Visual Object Tracking | NT-VOT211 | Precision | 51.7 | SLT-TransT
Visual Object Tracking | LaSOT | AUC | 66.8 | SLT-TransT
Visual Object Tracking | LaSOT | Normalized Precision | 75.5 | SLT-TransT
Visual Object Tracking | GOT-10k | Average Overlap | 67.5 | SLT-TransT
Visual Object Tracking | GOT-10k | Success Rate 0.5 | 76.8 | SLT-TransT
Visual Object Tracking | GOT-10k | Success Rate 0.75 | 60.3 | SLT-TransT
Visual Object Tracking | TrackingNet | Accuracy | 82.8 | SLT-TransT
Visual Object Tracking | TrackingNet | Normalized Precision | 87.5 | SLT-TransT
Visual Object Tracking | TrackingNet | Precision | 81.4 | SLT-TransT

Related Papers

- CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
- VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
- Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
- Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback (2025-07-17)
- VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
- QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)
- Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
- Autonomous Resource Management in Microservice Systems via Reinforcement Learning (2025-07-17)