Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Transformer Tracking

Xin Chen, Bin Yan, Jiawen Zhu, Dong Wang, Xiaoyun Yang, Huchuan Lu

2021-03-29 · CVPR 2021 · Visual Object Tracking · Visual Tracking · Object Tracking · Video Object Tracking
Paper · PDF · Code (official)

Abstract

Correlation plays a critical role in the tracking field, especially in recent popular Siamese-based trackers. The correlation operation is a simple fusion method for measuring the similarity between the template and the search region. However, correlation itself is a local linear matching process that tends to lose semantic information and fall into local optima, which may be a bottleneck in designing high-accuracy tracking algorithms. Is there a better feature fusion method than correlation? To address this issue, inspired by Transformer, this work presents a novel attention-based feature fusion network, which effectively combines the template and search region features using attention alone. Specifically, the proposed method includes an ego-context augment module based on self-attention and a cross-feature augment module based on cross-attention. Finally, we present a Transformer tracking method (named TransT) based on a Siamese-like feature extraction backbone, the designed attention-based fusion mechanism, and a classification and regression head. Experiments show that TransT achieves very promising results on six challenging datasets, especially on the large-scale LaSOT, TrackingNet, and GOT-10k benchmarks. Our tracker runs at approximately 50 fps on GPU. Code and models are available at https://github.com/chenxin-dlut/TransT.
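The core idea of the fusion described above — search-region tokens first attending to themselves (the self-attention "ego-context augment"), then attending to template tokens (the cross-attention "cross-feature augment") — can be illustrated with a minimal single-head attention sketch. This is not the authors' implementation (TransT uses multi-head attention with positional encodings, in PyTorch); the function and variable names below are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query_feats, kv_feats):
    """Single-head scaled dot-product attention.

    query_feats: (Nq, d) tokens producing queries
    kv_feats:    (Nkv, d) tokens producing keys and values
    Returns (Nq, d): each query token as a weighted sum of value tokens.
    """
    d = query_feats.shape[-1]
    scores = query_feats @ kv_feats.T / np.sqrt(d)  # (Nq, Nkv) similarity
    weights = softmax(scores, axis=-1)              # rows sum to 1
    return weights @ kv_feats                       # (Nq, d)

# toy example: 4 template tokens and 6 search-region tokens, 8-dim features
rng = np.random.default_rng(0)
template = rng.standard_normal((4, 8))
search = rng.standard_normal((6, 8))

# ego-context augment ~ self-attention within the search region
search_augmented = attend(search, search)
# cross-feature augment ~ search tokens attend to template tokens
fused = attend(search_augmented, template)
print(fused.shape)  # (6, 8)
```

Unlike a correlation operation, which reduces template–search similarity to a single linear response map, each fused token here is a semantically weighted combination of template features, which is the property the abstract argues avoids the local-linear-matching bottleneck.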

Results

| Task                   | Dataset   | Metric               | Value | Model  |
|------------------------|-----------|----------------------|-------|--------|
| Video                  | NT-VOT211 | AUC                  | 36.79 | TransT |
| Video                  | NT-VOT211 | Precision            | 51.97 | TransT |
| Visual Tracking        | TNL2K     | AUC                  | 50.7  | TransT |
| Object Tracking        | COESOT    | Precision Rate       | 67.9  | TransT |
| Object Tracking        | COESOT    | Success Rate         | 60.5  | TransT |
| Object Tracking        | LaSOT     | AUC                  | 64.9  | TransT |
| Object Tracking        | LaSOT     | Normalized Precision | 73.8  | TransT |
| Object Tracking        | LaSOT     | Precision            | 69    | TransT |
| Object Tracking        | DiDi      | Tracking quality     | 0.465 | TransT |
| Object Tracking        | AVisT     | Success Rate         | 49.03 | TransT |
| Object Tracking        | NT-VOT211 | AUC                  | 36.79 | TransT |
| Object Tracking        | NT-VOT211 | Precision            | 51.97 | TransT |
| Visual Object Tracking | LaSOT     | AUC                  | 64.9  | TransT |
| Visual Object Tracking | LaSOT     | Normalized Precision | 73.8  | TransT |
| Visual Object Tracking | LaSOT     | Precision            | 69    | TransT |
| Visual Object Tracking | DiDi      | Tracking quality     | 0.465 | TransT |
| Visual Object Tracking | AVisT     | Success Rate         | 49.03 | TransT |

Related Papers

MVA 2025 Small Multi-Object Tracking for Spotting Birds Challenge: Dataset, Methods, and Results (2025-07-17)
YOLOv8-SMOT: An Efficient and Robust Framework for Real-Time Small Object Tracking via Slice-Assisted Training and Adaptive Association (2025-07-16)
HiM2SAM: Enhancing SAM2 with Hierarchical Motion Estimation and Memory Optimization towards Long-term Tracking (2025-07-10)
What You Have is What You Track: Adaptive and Robust Multimodal Tracking (2025-07-08)
Robustifying 3D Perception through Least-Squares Multi-Agent Graphs Object Tracking (2025-07-07)
UMDATrack: Unified Multi-Domain Adaptive Tracking Under Adverse Weather Conditions (2025-07-01)
Mamba-FETrack V2: Revisiting State Space Model for Frame-Event based Visual Object Tracking (2025-06-30)
Visual and Memory Dual Adapter for Multi-Modal Object Tracking (2025-06-30)