Guangcong Wang, Jian-Huang Lai, Peigen Huang, Xiaohua Xie
Most current person re-identification (ReID) methods neglect spatial-temporal constraints. Given a query image, conventional methods compute the feature distances between the query image and all gallery images and return a similarity-ranked list. When the gallery database is very large, as it often is in practice, these approaches perform poorly because of appearance ambiguity across different camera views. In this paper, we propose a novel two-stream spatial-temporal person ReID (st-ReID) framework that mines both visual semantic information and spatial-temporal information. To this end, a joint similarity metric with Logistic Smoothing (LS) is introduced to integrate the two kinds of heterogeneous information into a unified framework. To approximate a complex spatial-temporal probability distribution, we develop a fast Histogram-Parzen (HP) method. With the help of the spatial-temporal constraint, the st-ReID model eliminates many irrelevant images and thus narrows the gallery database. Without bells and whistles, our st-ReID method achieves rank-1 accuracy of 98.1% on Market-1501 and 94.4% on DukeMTMC-reID, improving on the baselines of 91.2% and 83.8%, respectively, and outperforming all previous state-of-the-art methods by a large margin.
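
The abstract names the two ingredients (a Histogram-Parzen estimate of the spatial-temporal distribution and a Logistic Smoothing joint metric) but does not give their formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the histogram is built over time gaps for one camera pair and smoothed with a Gaussian window, and that the joint score is the product of logistic-smoothed visual and spatial-temporal terms. The function names, the parameters `lam`, `gamma`, `sigma`, and the toy data are all hypothetical.

```python
import math


def logistic_smooth(x, lam=1.0, gamma=5.0):
    """Logistic function 1 / (1 + lam * exp(-gamma * x)).

    lam and gamma are illustrative values, not taken from the paper.
    """
    return 1.0 / (1.0 + lam * math.exp(-gamma * x))


def histogram_parzen(time_gaps, bin_width, num_bins, sigma=1.0):
    """Histogram of time gaps, smoothed by a Gaussian window over bins.

    Returns a list of num_bins values normalized to sum to 1
    (or all zeros if no gap falls inside the histogram range).
    """
    counts = [0.0] * num_bins
    for gap in time_gaps:
        k = int(gap // bin_width)
        if 0 <= k < num_bins:
            counts[k] += 1.0

    # Parzen-style smoothing: each bin receives Gaussian-weighted
    # contributions from every other bin (assumed kernel choice).
    smoothed = []
    for k in range(num_bins):
        total = 0.0
        for j in range(num_bins):
            total += counts[j] * math.exp(-((k - j) ** 2) / (2.0 * sigma ** 2))
        smoothed.append(total)

    norm = sum(smoothed)
    return [v / norm for v in smoothed] if norm > 0 else smoothed


def joint_score(visual_sim, st_prob, lam=1.0, gamma=5.0):
    """Assumed joint metric: product of two logistic-smoothed terms."""
    return logistic_smooth(visual_sim, lam, gamma) * logistic_smooth(st_prob, lam, gamma)


# Toy data: time gaps (in frames) observed for one camera pair.
gaps = [12, 15, 14, 40, 13]
hist = histogram_parzen(gaps, bin_width=10, num_bins=6)

# Same visual similarity, but a plausible versus an implausible time gap.
likely = joint_score(0.8, hist[1])
unlikely = joint_score(0.8, hist[5])
```

With equal visual similarity, the gallery image whose time gap falls in a well-populated bin scores higher, which is how a spatial-temporal term can push implausible candidates down the ranking.
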
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Person Re-Identification | Market-1501 | Rank-1 | 98 | st-ReID(RE, RK) |
| Person Re-Identification | Market-1501 | Rank-5 | 98.9 | st-ReID(RE, RK) |
| Person Re-Identification | Market-1501 | mAP | 95.5 | st-ReID(RE, RK) |
| Person Re-Identification | DukeMTMC-reID | Rank-1 | 94.5 | st-ReID(RE, RK,Cam) |
| Person Re-Identification | DukeMTMC-reID | mAP | 92.7 | st-ReID(RE, RK,Cam) |
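
The table reports Rank-1, Rank-5, and mAP. As a rough illustration of how these metrics are commonly computed from a ranked gallery, the sketch below assumes each query is represented by a list of booleans marking whether the gallery image at each rank position shares the query identity. The helper names and toy data are hypothetical, and dataset-specific evaluation rules (such as excluding same-camera matches) are omitted.

```python
def rank_k_hit(ranked_is_match, k):
    """True if at least one correct match appears in the top k positions."""
    return any(ranked_is_match[:k])


def average_precision(ranked_is_match):
    """Average of the precision values at each correct match position."""
    hits = 0
    precision_sum = 0.0
    for position, is_match in enumerate(ranked_is_match, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


def evaluate(queries, ks=(1, 5)):
    """Return ({k: Rank-k accuracy in %}, mAP in %) over all queries."""
    n = len(queries)
    cmc = {k: 100.0 * sum(rank_k_hit(q, k) for q in queries) / n for k in ks}
    mean_ap = 100.0 * sum(average_precision(q) for q in queries) / n
    return cmc, mean_ap


# Toy example: two queries, three gallery images each, sorted by similarity.
queries = [
    [True, False, True],   # correct match at rank 1 and rank 3
    [False, True, False],  # first correct match at rank 2
]
cmc, mean_ap = evaluate(queries)
```

Rank-k only asks whether any correct match appears in the top k, while mAP also rewards ranking all correct matches early, which is why the two can differ noticeably for the same model.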