Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Do Different Tracking Tasks Require Different Appearance Models?

Zhongdao Wang, Hengshuang Zhao, Ya-Li Li, Shengjin Wang, Philip H. S. Torr, Luca Bertinetto

Date: 2021-07-05
Venue: NeurIPS 2021 12
Tasks: Visual Object Tracking, Semi-Supervised Video Object Segmentation, Visual Tracking, Multi-Object Tracking and Segmentation, Multi-Object Tracking, Pose Estimation, Video Object Segmentation, Object Tracking, Pose Tracking, Pose Prediction, Online Multi-Object Tracking, Video Instance Segmentation, Video Object Tracking, Multiple People Tracking

Abstract

Tracking objects of interest in a video is one of the most popular and widely applicable problems in computer vision. However, over the years, a Cambrian explosion of use cases and benchmarks has fragmented the problem into a multitude of different experimental setups. As a consequence, the literature has fragmented too, and now novel approaches proposed by the community are usually specialised to fit only one specific setup. To understand to what extent this specialisation is necessary, in this work we present UniTrack, a solution to address five different tasks within the same framework. UniTrack consists of a single and task-agnostic appearance model, which can be learned in a supervised or self-supervised fashion, and multiple "heads" that address individual tasks and do not require training. We show how most tracking tasks can be solved within this framework, and that the same appearance model can be successfully used to obtain results that are competitive against specialised methods for most of the tasks considered. The framework also allows us to analyse appearance models obtained with the most recent self-supervised methods, thus extending their evaluation and comparison to a larger variety of important problems.
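
The idea of a shared appearance model feeding a head that needs no training can be illustrated with a minimal sketch. This is not UniTrack's actual code: the embeddings below are hand-written stand-ins for the output of an appearance model, the function names `cosine_similarity` and `associate` and the `min_sim` threshold are hypothetical, and a simple greedy matcher is used in place of whatever association algorithm the paper's heads use.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two appearance embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def associate(
    tracks: Dict[int, Vector],
    detections: List[Vector],
    min_sim: float = 0.5,  # assumed threshold, chosen only for illustration
) -> Dict[int, int]:
    """Training-free association head (greedy, not optimal).

    Returns a mapping {detection_index: track_id}. Detections whose best
    similarity falls below `min_sim` are left unmatched.
    """
    # Score every (track, detection) pair, best pairs first.
    pairs = sorted(
        (
            (cosine_similarity(emb, det), track_id, det_idx)
            for track_id, emb in tracks.items()
            for det_idx, det in enumerate(detections)
        ),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used_tracks = set()
    for sim, track_id, det_idx in pairs:
        if sim < min_sim:
            break  # remaining pairs are even less similar
        if track_id in used_tracks or det_idx in matches:
            continue  # each track and each detection is matched at most once
        matches[det_idx] = track_id
        used_tracks.add(track_id)
    return matches


# Stand-in embeddings; a real system would obtain these from the appearance model.
tracks = {1: [1.0, 0.0, 0.1], 2: [0.0, 1.0, 0.1]}
detections = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, -1.0]]
matches = associate(tracks, detections)
print(matches)
```

Here the first detection is assigned to track 2, the second to track 1, and the third stays unmatched because it resembles neither track. The head itself contains no learned parameters, so swapping in a different appearance model only changes the embeddings it receives.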

Results

Task | Dataset | Metric | Value | Model
Video | DAVIS 2017 | mIoU | 58.4 | UniTrack
Multi-Object Tracking | MOTS20 | IDF1 | 67.2 | UniTrack
Multi-Object Tracking | MOTS20 | IDs | 622 | UniTrack
Multi-Object Tracking | MOTS20 | sMOTSA | 68.9 | UniTrack
Multi-Object Tracking | MOT16 | IDF1 | 71.8 | UniTrack
Multi-Object Tracking | MOT16 | IDs | 683 | UniTrack
Multi-Object Tracking | MOT16 | MOTA | 74.7 | UniTrack
Pose Estimation | J-HMDB | Mean PCK@0.1 | 58.3 | UniTrack_i18
Pose Estimation | J-HMDB | Mean PCK@0.2 | 80.5 | UniTrack_i18
Object Tracking | MOTS20 | IDF1 | 67.2 | UniTrack
Object Tracking | MOTS20 | IDs | 622 | UniTrack
Object Tracking | MOTS20 | sMOTSA | 68.9 | UniTrack
Object Tracking | MOT16 | IDF1 | 71.8 | UniTrack
Object Tracking | MOT16 | IDs | 683 | UniTrack
Object Tracking | MOT16 | MOTA | 74.7 | UniTrack
Object Tracking | OTB-2015 | AUC | 0.618 | UniTrack_DCF
3D | J-HMDB | Mean PCK@0.1 | 58.3 | UniTrack_i18
3D | J-HMDB | Mean PCK@0.2 | 80.5 | UniTrack_i18
Pose Tracking | PoseTrack2018 | IDF1 | 73.2 | UniTrack
Pose Tracking | PoseTrack2018 | IDs | 6760 | UniTrack
Pose Tracking | PoseTrack2018 | MOTA | 63.5 | UniTrack
Video Object Segmentation | DAVIS 2017 | mIoU | 58.4 | UniTrack
Video Instance Segmentation | YouTube-VIS validation | mask AP | 30.1 | UniTrack
Visual Object Tracking | OTB-2015 | AUC | 0.618 | UniTrack_DCF
1 Image, 2*2 Stitchi | J-HMDB | Mean PCK@0.1 | 58.3 | UniTrack_i18
1 Image, 2*2 Stitchi | J-HMDB | Mean PCK@0.2 | 80.5 | UniTrack_i18

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
MVA 2025 Small Multi-Object Tracking for Spotting Birds Challenge: Dataset, Methods, and Results (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
YOLOv8-SMOT: An Efficient and Robust Framework for Real-Time Small Object Tracking via Slice-Assisted Training and Adaptive Association (2025-07-16)