Multiview Detection with Feature Perspective Transformation

Yunzhong Hou, Liang Zheng, Stephen Gould

Date: 2020-07-14
Venue: ECCV 2020 8
Tasks: Human Detection, Multiview Detection, Pedestrian Detection

Abstract

Incorporating multiple camera views for detection alleviates the impact of occlusions in crowded scenes. In a multiview system, we need to answer two important questions when dealing with ambiguities that arise from occlusions. First, how should we aggregate cues from the multiple views? Second, how should we aggregate unreliable 2D and 3D spatial information that has been tainted by occlusions? To address these questions, we propose a novel multiview detection system, MVDet. For multiview aggregation, existing methods combine anchor box features from the image plane, which potentially limits performance due to inaccurate anchor box shapes and sizes. In contrast, we take an anchor-free approach to aggregate multiview information by projecting feature maps onto the ground plane (bird's eye view). To resolve any remaining spatial ambiguity, we apply large kernel convolutions on the ground plane feature map and infer locations from detection peaks. Our entire model is end-to-end learnable and achieves 88.2% MODA on the standard Wildtrack dataset, outperforming the state-of-the-art by 14.1%. We also provide detailed analysis of MVDet on a newly introduced synthetic dataset, MultiviewX, which allows us to control the level of occlusion. Code and MultiviewX dataset are available at https://github.com/hou-yz/MVDet.
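
The anchor-free aggregation described in the abstract can be illustrated with a small NumPy sketch: each view's feature map is warped onto a shared ground-plane grid with a homography, the warped maps are stacked, and person locations are read off as peaks of a ground-plane score map. This is a toy illustration under stated assumptions, not the authors' implementation: the function names `warp_to_ground` and `find_peaks`, the translation-only homographies, the threshold and the window radius are all hypothetical, and a plain channel mean stands in for the learned large-kernel convolutions that MVDet applies on the ground plane.

```python
import numpy as np


def warp_to_ground(feat, H_img_from_ground, out_hw):
    """Inverse-warp a (C, h, w) feature map onto a (C, gh, gw) ground grid.

    H_img_from_ground is a 3x3 homography taking homogeneous ground-cell
    coordinates (x, y, 1) to feature-map coordinates (u, v, 1). Ground cells
    that fall outside the view stay zero. Nearest-neighbour sampling only.
    """
    C, h, w = feat.shape
    gh, gw = out_hw
    ys, xs = np.mgrid[0:gh, 0:gw]
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(gh * gw)])  # (3, gh*gw)
    uvw = H_img_from_ground @ grid
    depth = uvw[2]
    safe = np.where(np.abs(depth) > 1e-9, depth, 1.0)  # avoid division by zero
    u = np.floor(uvw[0] / safe + 0.5).astype(int)
    v = np.floor(uvw[1] / safe + 0.5).astype(int)
    valid = (depth > 1e-9) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    out = np.zeros((C, gh * gw), dtype=feat.dtype)
    out[:, valid] = feat[:, v[valid], u[valid]]
    return out.reshape(C, gh, gw)


def find_peaks(score, thresh=0.4, radius=2):
    """Return (row, col) cells above thresh that are maximal in their window.

    thresh and radius are illustrative values, not taken from the paper.
    Cells that tie for the maximum are all reported.
    """
    padded = np.pad(score, radius, constant_values=-np.inf)
    peaks = []
    for r, c in zip(*np.nonzero(score > thresh)):
        window = padded[r:r + 2 * radius + 1, c:c + 2 * radius + 1]
        if score[r, c] >= window.max():
            peaks.append((int(r), int(c)))
    return peaks


# Two toy single-channel views that both observe ground cell (x=4, y=2).
feat_a = np.zeros((1, 4, 4))
feat_a[0, 1, 2] = 1.0  # response at image (u=2, v=1) in view A
feat_b = np.zeros((1, 4, 4))
feat_b[0, 3, 0] = 1.0  # response at image (u=0, v=3) in view B

# Hypothetical homographies (pure translations, chosen for readability).
H_a = np.array([[1.0, 0.0, -2.0], [0.0, 1.0, -1.0], [0.0, 0.0, 1.0]])
H_b = np.array([[1.0, 0.0, -4.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]])

# Stack the warped views along the channel axis, as in multiview aggregation.
bev = np.concatenate([
    warp_to_ground(feat_a, H_a, (8, 8)),
    warp_to_ground(feat_b, H_b, (8, 8)),
])
score = bev.mean(axis=0)  # stand-in for the learned ground-plane convolutions
peaks = find_peaks(score)
print(peaks)
```

Because both views project their response to the same ground cell, the two cues reinforce each other there, which is the intuition behind aggregating on the ground plane rather than matching anchor boxes across image planes. In a real system the homographies would come from camera calibration and the sampling would typically be bilinear and differentiable so the model stays end-to-end learnable.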

Results

The page lists the same results under six task tags: Object Detection, 3D, 3D Object Detection, 2D Classification, 2D Object Detection, and 16k. They are shown once here.

Dataset      Metric            Value   Model
Wildtrack    MODA              88.2    MVDet
Wildtrack    MODP              75.7    MVDet
Wildtrack    Recall            93.6    MVDet
CityStreet   F1_score (2m)     68.4    MVDet
CityStreet   MODA (2m)         44.6    MVDet
CityStreet   MODP (2m)         65.7    MVDet
CityStreet   Precision (2m)    79.8    MVDet
CityStreet   Recall (2m)       59.8    MVDet
CVCS         F1_score (1m)     60.9    MVDet
CVCS         MODA (1m)         36.6    MVDet
CVCS         MODP (1m)         71      MVDet
CVCS         Precision (1m)    79.4    MVDet
CVCS         Recall (1m)       49.4    MVDet
MultiviewX   MODA              93.6    MVDet
MultiviewX   MODP              79.6    MVDet
MultiviewX   Recall            86.7    MVDet
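
The CityStreet and CVCS rows report precision, recall, F1 and MODA together, so their internal consistency can be checked. The sketch below assumes the standard definitions F1 = 2PR / (P + R) and MODA = 1 - (FP + FN) / GT with unit error costs, and assumes all four numbers were computed with the same matching threshold; under those assumptions MODA can be rewritten in terms of precision and recall as R * (2 - 1/P). The helper names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def moda_from_pr(precision, recall):
    """MODA = 1 - (FP + FN) / GT, expressed through precision and recall.

    With TP = R * GT, FN = (1 - R) * GT and FP = TP * (1 - P) / P,
    the expression simplifies to R * (2 - 1 / P).
    """
    return recall * (2 - 1 / precision)


# (precision, recall, reported F1, reported MODA), all in percent, from the table.
reported = {
    "CityStreet": (79.8, 59.8, 68.4, 44.6),
    "CVCS": (79.4, 49.4, 60.9, 36.6),
}

deviations = {}
for name, (p, r, f1_rep, moda_rep) in reported.items():
    f1 = 100 * f1_score(p / 100, r / 100)
    moda = 100 * moda_from_pr(p / 100, r / 100)
    deviations[name] = (abs(f1 - f1_rep), abs(moda - moda_rep))
    print(f"{name}: F1 {f1:.2f} (reported {f1_rep}), MODA {moda:.2f} (reported {moda_rep})")
```

Since the inputs are themselves rounded to one decimal place, the recomputed values can differ from the reported ones by roughly a tenth of a point. A larger gap would suggest that the metrics were computed under different thresholds or definitions.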

Related Papers

YOLO-APD: Enhancing YOLOv8 for Robust Pedestrian Detection on Complex Road Geometries (2025-07-07)
TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent (2025-05-26)
Distance Estimation in Outdoor Driving Environments Using Phase-only Correlation Method with Event Cameras (2025-05-23)
R3GS: Gaussian Splatting for Robust Reconstruction and Relocalization in Unconstrained Image Collections (2025-05-21)
DetReIDX: A Stress-Test Dataset for Real-World UAV-Based Person Recognition (2025-05-07)
Person detection and re-identification in open-world settings of retail stores and public spaces (2025-05-01)
Hardware/Software Co-Design of RISC-V Extensions for Accelerating Sparse DNNs on FPGAs (2025-04-28)
Finding the Reflection Point: Unpadding Images to Remove Data Augmentation Artifacts in Large Open Source Image Datasets for Machine Learning (2025-04-04)