Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Is Synthetic Dataset Reliable for Benchmarking Generalizable Person Re-Identification?

Cuicui Kang

2022-09-12 | Benchmarking | Person Re-Identification | Generalizable Person Re-identification

Abstract

Recent studies show that models trained on synthetic datasets can achieve better generalizable person re-identification (GPReID) performance than those trained on public real-world datasets. Moreover, given the limitations of real-world person ReID datasets, it would also be important and interesting to use large-scale synthetic datasets as test sets to benchmark person ReID algorithms. Yet this raises a critical question: is a synthetic dataset reliable for benchmarking generalizable person re-identification? The literature offers no evidence on this question. To address it, we design a method called Pairwise Ranking Analysis (PRA) to quantitatively measure ranking similarity and to perform a statistical test of identical distributions. Specifically, we employ Kendall rank correlation coefficients to evaluate the pairwise similarity between algorithm rankings on different datasets. Then, a non-parametric two-sample Kolmogorov-Smirnov (KS) test is performed to judge whether the algorithm ranking correlations between synthetic and real-world datasets and those between real-world datasets only follow identical distributions. We conduct comprehensive experiments with ten representative algorithms, three popular real-world person ReID datasets, and three recently released large-scale synthetic datasets. Through the designed pairwise ranking analysis and comprehensive evaluations, we conclude that a recent large-scale synthetic dataset, ClonedPerson, can be reliably used to benchmark GPReID, statistically the same as real-world datasets. Therefore, this study guarantees the use of synthetic datasets as both source training sets and target test sets, with no privacy concerns from real-world surveillance data. The study may also inspire future designs of synthetic datasets.
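The two steps of PRA described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the dataset names (`real_A`, `synth_X`, and so on), the five hypothetical algorithms, and all scores are made up, and the helpers `kendall_tau` and `ks_statistic` are simplified stand-ins written for this example (tau-a without tie handling, and the KS statistic without a p-value).

```python
from itertools import combinations, product


def kendall_tau(x, y):
    """Kendall tau-a between two equal-length score lists (assumes no ties)."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    def ecdf(sample, t):
        return sum(v <= t for v in sample) / len(sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, t) - ecdf(b, t)) for t in points)


# Made-up scores of five hypothetical algorithms on each test set
# (position i in every list is the same algorithm).
scores = {
    "real_A": [40.1, 45.3, 50.2, 55.0, 60.4],
    "real_B": [30.5, 33.0, 38.9, 37.2, 44.8],
    "real_C": [20.2, 26.7, 24.1, 29.9, 33.3],
    "synth_X": [35.0, 41.2, 39.8, 48.6, 52.1],
}
real = ["real_A", "real_B", "real_C"]
synthetic = ["synth_X"]

# Step 1: ranking correlation for every real-real and synthetic-real pair.
tau_real_real = [kendall_tau(scores[p], scores[q])
                 for p, q in combinations(real, 2)]
tau_synth_real = [kendall_tau(scores[s], scores[r])
                  for s, r in product(synthetic, real)]

# Step 2: compare the two sets of correlations with the KS statistic.
d = ks_statistic(tau_synth_real, tau_real_real)
print(tau_real_real, tau_synth_real, d)
```

A small KS statistic means the synthetic-to-real correlations are distributed much like the real-to-real ones. For an actual significance test with p-values, `scipy.stats.kendalltau` and `scipy.stats.ks_2samp` are the usual choices; with only a handful of dataset pairs, as here, the test has little power.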

Results

Task                      Dataset       Metric               Value  Model
Person Re-Identification  ClonedPerson  MSMT17->Rank-1       51.8   TransMatcher
Person Re-Identification  ClonedPerson  MSMT17->mAP          9      TransMatcher
Person Re-Identification  ClonedPerson  Market-1501->Rank-1  50.1   TransMatcher
Person Re-Identification  ClonedPerson  Market-1501->mAP     9.2    TransMatcher
Person Re-Identification  ClonedPerson  RandPerson->Rank-1   67.8   TransMatcher
Person Re-Identification  ClonedPerson  RandPerson->mAP      22.1   TransMatcher

Related Papers

Visual Place Recognition for Large-Scale UAV Applications | 2025-07-20
Training Transformers with Enforced Lipschitz Constants | 2025-07-17
Disentangling coincident cell events using deep transfer learning and compressive sensing | 2025-07-17
MUPAX: Multidimensional Problem Agnostic eXplainable AI | 2025-07-17
Weakly Supervised Visible-Infrared Person Re-Identification via Heterogeneous Expert Collaborative Consistency Learning | 2025-07-17
WhoFi: Deep Person Re-Identification via Wi-Fi Channel Signal Encoding | 2025-07-17
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition | 2025-07-16
DCR: Quantifying Data Contamination in LLMs Evaluation | 2025-07-15