
Tracking Pedestrian Heads in Dense Crowd

Ramana Sundararaman, Cedric De Almeida Braga, Eric Marchand, Julien Pettre

Published: 2021-03-24 (CVPR 2021)
Tasks: Head Detection, Scene Understanding, Multi-Object Tracking, Multiple Object Tracking

Abstract

Tracking humans in crowded video sequences is an important constituent of visual scene understanding. Increasing crowd density reduces the visibility of humans, limiting the scalability of existing pedestrian trackers to higher crowd densities. For that reason, we propose to revitalize head tracking with the Crowd of Heads Dataset (CroHD), consisting of 9 sequences of 11,463 frames with over 2,276,838 heads and 5,230 tracks annotated in diverse scenes. For evaluation, we propose a new metric, IDEucl, to measure an algorithm's efficacy in preserving a unique identity for the longest stretch in image coordinate space, thus building a correspondence between pedestrian crowd motion and the performance of a tracking algorithm. Moreover, we also propose a new head detector, HeadHunter, which is designed for small head detection in crowded scenes. We extend HeadHunter with a Particle Filter and a color-histogram-based re-identification module for head tracking. To establish this as a strong baseline, we compare our tracker with existing state-of-the-art pedestrian trackers on CroHD and demonstrate superiority, especially in identity-preserving tracking metrics. With a lightweight head detector and a tracker that is efficient at identity preservation, we believe our contributions will prove useful in advancing pedestrian tracking in dense crowds.
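
To make the idea behind IDEucl more concrete, the following is a minimal sketch of one way such a metric could be computed. It is not the authors' reference implementation. It assumes that each ground-truth track is given as a time-ordered list of (x, y, predicted_id) points in image coordinates, that predicted_id is None where the tracker missed the head, and that the score is the path length covered under the single longest-lived predicted identity, summed over tracks and divided by the total ground-truth path length. The function name `ideucl` and the input layout are hypothetical.

```python
import math
from collections import defaultdict


def ideucl(gt_tracks):
    """Sketch of an IDEucl-style score (assumed formulation, see lead-in).

    gt_tracks: dict mapping a ground-truth track id to a time-ordered list of
    (x, y, predicted_id) tuples; predicted_id is None for missed detections.
    Returns a value in [0, 1]; 0.0 if no ground-truth motion exists.
    """
    covered = 0.0  # distance travelled under the best predicted id, all tracks
    total = 0.0    # total ground-truth distance travelled, all tracks

    for points in gt_tracks.values():
        per_id = defaultdict(float)  # distance covered per predicted id
        for (x0, y0, id0), (x1, y1, id1) in zip(points, points[1:]):
            step = math.hypot(x1 - x0, y1 - y0)
            total += step
            # A step counts only if the same predicted id holds at both ends.
            if id0 is not None and id0 == id1:
                per_id[id0] += step
        covered += max(per_id.values(), default=0.0)

    return covered / total if total > 0 else 0.0


# One ground-truth track of four 5-pixel steps; the tracker switches
# identity from 1 to 2 after the second point.
example = {
    "gt_0": [(0, 0, 1), (3, 4, 1), (6, 8, 2), (9, 12, 2), (12, 16, 2)],
}
print(ideucl(example))  # 0.5
```

In this example, identity 2 covers two of the four steps (10 of 20 pixels), identity 1 covers one, and the step across the switch counts for neither, so the score is 0.5. Measuring in path length rather than frame counts is what ties the score to crowd motion: a stationary head contributes little, however long it is tracked.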

Results

Dataset: CroHD (the same results are listed under the tasks Video, Object Tracking, and Multiple Object Tracking)

Model          MOTA   IDF1   IDEucl   MT    ML    IDs
HeadHunter-T   63.6   57.1   60.3     146   93    892
Tracktor       58.9   38.5   31.8     125   117   3474
SORT           46.4   48.4   58       49    216   649
