Zimeng Fang, Chao Liang, Xue Zhou, Shuyuan Zhu, Xi Li
Multi-object tracking (MOT) has emerged as a pivotal and highly promising branch of computer vision. Classical closed-vocabulary MOT (CV-MOT) methods aim to track objects of predefined categories. Recently, some open-vocabulary MOT (OV-MOT) methods have successfully addressed the problem of tracking unknown categories. However, we find that CV-MOT and OV-MOT methods each struggle to excel at the other's task. In this paper, we present a unified framework, Associate Everything Detected (AED), that tackles CV-MOT and OV-MOT simultaneously by integrating with any off-the-shelf detector, and that supports unknown categories. Unlike existing tracking-by-detection MOT methods, AED discards prior knowledge (e.g., motion cues) and relies solely on highly robust feature learning to handle complex trajectories in OV-MOT tasks while maintaining excellent performance in CV-MOT tasks. Specifically, we model the association task as a similarity decoding problem and propose a sim-decoder with an association-centric learning mechanism. The sim-decoder calculates similarities in three aspects: spatial, temporal, and cross-clip. Association-centric learning then leverages these threefold similarities to ensure that the extracted features are suitable for continuous tracking and robust enough to generalize to unknown categories. Compared with existing powerful OV-MOT and CV-MOT methods, AED achieves superior performance on TAO, SportsMOT, and DanceTrack without any prior knowledge. Our code is available at https://github.com/balabooooo/AED.
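To make the idea of association without motion cues concrete, the following is a minimal sketch of feature-only matching between existing tracks and new detections. It is not AED's sim-decoder: the learned similarity is replaced here by plain cosine similarity, and the assignment step by a simple greedy matcher. The names `cosine`, `associate`, and the threshold `min_sim` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def associate(track_feats, det_feats, min_sim=0.5):
    """Greedily match detections to tracks using appearance features only.

    track_feats: dict mapping track id -> feature vector (assumed input)
    det_feats:   list of feature vectors, one per detection (assumed input)
    Returns (matches, unmatched), where matches maps detection index -> track id
    and unmatched lists detection indices that would start new tracks.
    """
    # Score every (track, detection) pair; no boxes or motion model are used.
    pairs = [
        (cosine(t_feat, d_feat), track_id, det_idx)
        for track_id, t_feat in track_feats.items()
        for det_idx, d_feat in enumerate(det_feats)
    ]
    pairs.sort(key=lambda p: p[0], reverse=True)

    matches = {}
    used_tracks = set()
    for sim, track_id, det_idx in pairs:
        if sim < min_sim:
            break  # remaining pairs are even less similar
        if track_id in used_tracks or det_idx in matches:
            continue  # each track and each detection is matched at most once
        matches[det_idx] = track_id
        used_tracks.add(track_id)

    unmatched = [i for i in range(len(det_feats)) if i not in matches]
    return matches, unmatched


tracks = {1: [1.0, 0.0], 2: [0.0, 1.0]}
detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
matches, unmatched = associate(tracks, detections)
```

In this toy input, the first two detections are assigned to the tracks whose features they most resemble, and the third, which resembles neither, is left unmatched. Because the matcher only consumes feature vectors, it is indifferent to which detector produced the boxes, which is the property the abstract emphasizes.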
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | SportsMOT | AssA | 70.1 | AED |
| Video | SportsMOT | DetA | 89.4 | AED |
| Video | SportsMOT | HOTA | 79.1 | AED |
| Video | SportsMOT | IDF1 | 81.8 | AED |
| Video | SportsMOT | MOTA | 97.1 | AED |
| Multi-Object Tracking | TAO | AssocA | 52.4 | AED (Co-DETR) |
| Multi-Object Tracking | TAO | ClsA | 41.7 | AED (Co-DETR) |
| Multi-Object Tracking | TAO | LocA | 71.8 | AED (Co-DETR) |
| Multi-Object Tracking | TAO | TETA | 55.3 | AED (Co-DETR) |
| Multi-Object Tracking | TAO | AssocA | 38.1 | AED (RegionCLIP) |
| Multi-Object Tracking | TAO | ClsA | 16.2 | AED (RegionCLIP) |
| Multi-Object Tracking | TAO | LocA | 56.7 | AED (RegionCLIP) |
| Multi-Object Tracking | TAO | TETA | 37 | AED (RegionCLIP) |
| Multi-Object Tracking | DanceTrack | AssA | 54.3 | AED |
| Multi-Object Tracking | DanceTrack | DetA | 82 | AED |
| Multi-Object Tracking | DanceTrack | HOTA | 66.6 | AED |
| Multi-Object Tracking | DanceTrack | IDF1 | 69.7 | AED |
| Multi-Object Tracking | DanceTrack | MOTA | 92.2 | AED |
| Multi-Object Tracking | SportsMOT | AssA | 70.1 | AED |
| Multi-Object Tracking | SportsMOT | DetA | 89.4 | AED |
| Multi-Object Tracking | SportsMOT | HOTA | 79.1 | AED |
| Multi-Object Tracking | SportsMOT | IDF1 | 81.8 | AED |
| Multi-Object Tracking | SportsMOT | MOTA | 97.1 | AED |
| Object Tracking | TAO | AssocA | 52.4 | AED (Co-DETR) |
| Object Tracking | TAO | ClsA | 41.7 | AED (Co-DETR) |
| Object Tracking | TAO | LocA | 71.8 | AED (Co-DETR) |
| Object Tracking | TAO | TETA | 55.3 | AED (Co-DETR) |
| Object Tracking | TAO | AssocA | 38.1 | AED (RegionCLIP) |
| Object Tracking | TAO | ClsA | 16.2 | AED (RegionCLIP) |
| Object Tracking | TAO | LocA | 56.7 | AED (RegionCLIP) |
| Object Tracking | TAO | TETA | 37 | AED (RegionCLIP) |
| Object Tracking | DanceTrack | AssA | 54.3 | AED |
| Object Tracking | DanceTrack | DetA | 82 | AED |
| Object Tracking | DanceTrack | HOTA | 66.6 | AED |
| Object Tracking | DanceTrack | IDF1 | 69.7 | AED |
| Object Tracking | DanceTrack | MOTA | 92.2 | AED |
| Object Tracking | SportsMOT | AssA | 70.1 | AED |
| Object Tracking | SportsMOT | DetA | 89.4 | AED |
| Object Tracking | SportsMOT | HOTA | 79.1 | AED |
| Object Tracking | SportsMOT | IDF1 | 81.8 | AED |
| Object Tracking | SportsMOT | MOTA | 97.1 | AED |
| Multiple Object Tracking | SportsMOT | AssA | 70.1 | AED |
| Multiple Object Tracking | SportsMOT | DetA | 89.4 | AED |
| Multiple Object Tracking | SportsMOT | HOTA | 79.1 | AED |
| Multiple Object Tracking | SportsMOT | IDF1 | 81.8 | AED |
| Multiple Object Tracking | SportsMOT | MOTA | 97.1 | AED |
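
The composite scores in the table can be sanity-checked against their components. The sketch below assumes the standard definitions: TETA is the arithmetic mean of LocA, AssocA, and ClsA, and HOTA is the geometric mean of DetA and AssA at each localization threshold, averaged over thresholds. Because of that averaging, taking the geometric mean of the reported DetA and AssA only approximates the reported HOTA. The function names are hypothetical.

```python
import math


def teta(loc_a, assoc_a, cls_a):
    """TETA as the arithmetic mean of its three sub-scores."""
    return (loc_a + assoc_a + cls_a) / 3.0


def hota_estimate(det_a, ass_a):
    """Rough HOTA estimate from already-averaged DetA and AssA.

    The real metric averages sqrt(DetA * AssA) over localization thresholds,
    so this is an approximation, not the exact value.
    """
    return math.sqrt(det_a * ass_a)


# Values copied from the table above.
teta_codetr = teta(loc_a=71.8, assoc_a=52.4, cls_a=41.7)      # reported TETA: 55.3
teta_regionclip = teta(loc_a=56.7, assoc_a=38.1, cls_a=16.2)  # reported TETA: 37
hota_sportsmot = hota_estimate(det_a=89.4, ass_a=70.1)        # reported HOTA: 79.1
hota_dancetrack = hota_estimate(det_a=82.0, ass_a=54.3)       # reported HOTA: 66.6
```

The TETA rows reproduce the reported values after rounding to one decimal place, and the HOTA estimates fall within a few tenths of a point of the reported numbers, as expected for an approximation.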