TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/TransCenter: Transformers with Dense Representations for M...

TransCenter: Transformers with Dense Representations for Multiple-Object Tracking

Yihong Xu, Yutong Ban, Guillaume Delorme, Chuang Gan, Daniela Rus, Xavier Alameda-Pineda

2021-03-28Image ClassificationMulti-Object TrackingObject TrackingMultiple Object Trackingobject-detectionObject Detection
PaperPDFCodeCode(official)

Abstract

Transformers have proven superior performance for a wide variety of tasks since they were introduced. In recent years, they have drawn attention from the vision community in tasks such as image classification and object detection. Despite this wave, an accurate and efficient multiple-object tracking (MOT) method based on transformers is yet to be designed. We argue that the direct application of a transformer architecture with quadratic complexity and insufficient noise-initialized sparse queries - is not optimal for MOT. We propose TransCenter, a transformer-based MOT architecture with dense representations for accurately tracking all the objects while keeping a reasonable runtime. Methodologically, we propose the use of image-related dense detection queries and efficient sparse tracking queries produced by our carefully designed query learning networks (QLN). On one hand, the dense image-related detection queries allow us to infer targets' locations globally and robustly through dense heatmap outputs. On the other hand, the set of sparse tracking queries efficiently interacts with image features in our TransCenter Decoder to associate object positions through time. As a result, TransCenter exhibits remarkable performance improvements and outperforms by a large margin the current state-of-the-art methods in two standard MOT benchmarks with two tracking settings (public/private). TransCenter is also proven efficient and accurate by an extensive ablation study and comparisons to more naive alternatives and concurrent works. For scientific interest, the code is made publicly available at https://github.com/yihongxu/transcenter.

Results

TaskDatasetMetricValueModel
Multi-Object TrackingMOT20IDF157.9TransCenter
Multi-Object TrackingMOT20MOTA72.4TransCenter
Multi-Object TrackingMOT17IDF165.6TransCenter
Multi-Object TrackingMOT17MOTA76TransCenter
Object TrackingMOT20IDF157.9TransCenter
Object TrackingMOT20MOTA72.4TransCenter
Object TrackingMOT17IDF165.6TransCenter
Object TrackingMOT17MOTA76TransCenter

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations2025-07-18Adversarial attacks to image classification systems using evolutionary algorithms2025-07-17Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy2025-07-17Federated Learning for Commercial Image Sources2025-07-17MUPAX: Multidimensional Problem Agnostic eXplainable AI2025-07-17MVA 2025 Small Multi-Object Tracking for Spotting Birds Challenge: Dataset, Methods, and Results2025-07-17A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains2025-07-17RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images2025-07-17