CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Yuhong Li, Xiaofan Zhang, Deming Chen

2018-02-27 | CVPR 2018 | Crowd Counting, Scene Recognition

Abstract

We propose a network for Congested Scene Recognition called CSRNet to provide a data-driven, deep learning method that can understand highly congested scenes, perform accurate count estimation, and present high-quality density maps. The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN for the back-end, which uses dilated kernels to deliver larger receptive fields and to replace pooling operations. CSRNet is easy to train because of its purely convolutional structure. We demonstrate CSRNet on four datasets (the ShanghaiTech dataset, the UCF_CC_50 dataset, the WorldExpo'10 dataset, and the UCSD dataset) and deliver state-of-the-art performance. On the ShanghaiTech Part_B dataset, CSRNet achieves 47.3% lower Mean Absolute Error (MAE) than the previous state-of-the-art method. We extend the targeted applications to counting other objects, such as vehicles in the TRANCOS dataset. Results show that CSRNet significantly improves the output quality, with 15.4% lower MAE than the previous state-of-the-art approach.
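
The abstract's claim that dilated kernels deliver larger receptive fields without pooling can be illustrated with a small pure-Python sketch. The function names, the 1D toy signal, and the six-layer stacks compared here are illustrative assumptions, not the configuration reported by the authors. A kernel of size k with dilation rate d covers k + (k - 1)(d - 1) input positions while keeping only k weights.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span of input positions covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs (hypothetical layout).
    """
    receptive_field = 1
    for kernel_size, dilation in layers:
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D dilated convolution (cross-correlation form, no padding)."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


# Six plain 3x3-style layers versus six layers with dilation rate 2.
plain_rf = stacked_receptive_field([(3, 1)] * 6)    # 13
dilated_rf = stacked_receptive_field([(3, 2)] * 6)  # 25

# A 3-tap kernel with dilation 2 reads every other sample.
toy_output = dilated_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 1, 1], dilation=2)
# toy_output == [9, 12, 15]
```

In this toy comparison the dilated stack nearly doubles the receptive field with the same number of weights and no downsampling, which is the property the back-end relies on to keep the density map at a usable resolution.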

Results

Task	Dataset	Metric	Value	Model
Crowds	ShanghaiTech B	MAE	10.6	CSRNet
Crowds	TRANCOS	MAE	3.56	CSRNet
Crowds	ShanghaiTech A	MAE	68.2	CSRNet
Crowds	Venice	MAE	35.8	CSRNet
Crowds	UCF CC 50	MAE	266.1	CSRNet
Crowds	WorldExpo’10	Average MAE	8.6	CSRNet
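
For readers unfamiliar with the metric in the table, the following is a minimal Python sketch of how MAE is typically computed for counting, where each predicted count is the sum of the predicted density map. The helper names and the toy values are hypothetical. The last helper back-computes the baseline MAE implied by the abstract's relative reductions, under the assumption that those percentages refer to the same ShanghaiTech B and TRANCOS values listed above.

```python
def count_from_density_map(density_map):
    """Estimated object count: the sum over all density-map cells."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(predicted_counts, true_counts):
    """Average absolute difference between predicted and true counts."""
    if not true_counts or len(predicted_counts) != len(true_counts):
        raise ValueError("count lists must be non-empty and of equal length")
    total = sum(abs(p - t) for p, t in zip(predicted_counts, true_counts))
    return total / len(true_counts)


def implied_baseline_mae(mae, relative_reduction):
    """Baseline MAE such that `mae` is `relative_reduction` lower than it."""
    return mae / (1.0 - relative_reduction)


# Toy 2x2 density map (hypothetical values) that integrates to one object.
toy_count = count_from_density_map([[0.0, 0.5], [0.25, 0.25]])  # 1.0

# Toy per-image counts (hypothetical): errors of 2, 0, and 3.
toy_mae = mean_absolute_error([10, 20, 33], [12, 20, 30])

# Assumption: the table rows match the percentages quoted in the abstract.
baseline_part_b = implied_baseline_mae(10.6, 0.473)   # roughly 20.1
baseline_trancos = implied_baseline_mae(3.56, 0.154)  # roughly 4.21
```

The implied baselines are only a consistency check on the reported numbers; they are not values stated on this page.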

Related Papers

Car Object Counting and Position Estimation via Extension of the CLIP-EBC Framework (2025-07-11)
EBC-ZIP: Improving Blockwise Crowd Counting with Zero-Inflated Poisson Regression (2025-06-24)
Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting (2025-05-28)
Crowd Scene Analysis using Deep Learning Techniques (2025-05-13)
Transformer-Based Dual-Optical Attention Fusion Crowd Head Point Counting and Localization Network (2025-05-11)
A Short Overview of Multi-Modal Wi-Fi Sensing (2025-05-10)
Adept: Annotation-Denoising Auxiliary Tasks with Discrete Cosine Transform Map and Keypoint for Human-Centric Pretraining (2025-04-29)
ProgRoCC: A Progressive Approach to Rough Crowd Counting (2025-04-18)