Bum Jun Kim, Sang Woo Kim
Regularizing deep neural networks is important for achieving better generalization without overfitting. Although Dropout is a popular regularizer, it introduces inconsistent properties in the output, which may degrade the performance of deep neural networks. In this study, we propose a new module called stochastic average pooling (SAP), which incorporates Dropout-like stochasticity into pooling. We describe the properties of stochastic subsampling and average pooling and leverage them to design a module free of this inconsistency problem. Stochastic average pooling achieves a regularization effect without the potential performance degradation caused by the inconsistency issue and can easily be plugged into existing deep neural network architectures. Experiments demonstrate that replacing existing average pooling with stochastic average pooling yields consistent improvements across a variety of tasks, datasets, and models.
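The abstract does not give the exact formulation of the module, so the following is only a minimal sketch of the general idea, under stated assumptions: during training, average pooling is applied to a random subset of spatial positions (stochastic subsampling), and at inference it is applied to all positions. The function name `stochastic_average_pool`, the `keep_ratio` parameter, and the choice of global pooling over a `(channels, height, width)` array are hypothetical and are not taken from the paper; any scaling the authors may apply is omitted here.

```python
import numpy as np

def stochastic_average_pool(x, keep_ratio=0.5, training=True, rng=None):
    """Global average pooling over a random subset of spatial positions.

    x:          array of shape (channels, height, width).
    keep_ratio: hypothetical parameter, fraction of positions kept in training.
    training:   if False, fall back to ordinary global average pooling.
    rng:        optional numpy Generator, for reproducible sampling.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)

    if not training:
        # Inference: plain average over every spatial position.
        return flat.mean(axis=1)

    rng = np.random.default_rng() if rng is None else rng
    # Keep at least one position so the mean is always defined.
    k = max(1, int(round(keep_ratio * h * w)))
    # Sample positions without replacement, shared across channels (an assumption).
    idx = rng.choice(h * w, size=k, replace=False)
    return flat[:, idx].mean(axis=1)
```

Because the subset is drawn uniformly without replacement, the training-time output in this sketch has the same expectation as the ordinary average used at inference, which is one way to read the consistency property the abstract refers to.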
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | ISPRS Vaihingen | Category mIoU | 73.27 | UPerNet (SAP) |
| Semantic Segmentation | ISPRS Vaihingen | Overall Accuracy | 90.14 | UPerNet (SAP) |
| Semantic Segmentation | ISPRS Potsdam | Mean IoU | 74.3 | PSPNet (SAP) |
| Semantic Segmentation | ISPRS Potsdam | Overall Accuracy | 88.56 | PSPNet (SAP) |
| Object Detection | COCO 2017 | AP | 42.1 | DyHead (SAP) |
| Object Detection | COCO 2017 | AP50 | 59.4 | DyHead (SAP) |
| Object Detection | COCO 2017 | AP75 | 45.9 | DyHead (SAP) |
| Image Classification | Stanford Cars | Accuracy | 85.812 | SE-ResNet-101 (SAP) |
| Image Classification | CIFAR-10 | Percentage correct | 93.861 | ResNet-110 (SAP) |
| Image Classification | CIFAR-100 | Percentage correct | 72.537 | ResNet-110 (SAP) |
| Image Classification | Oxford-IIIT Pets | Accuracy | 86.011 | SE-ResNet-101 (SAP) |
| Fine-Grained Image Classification | Oxford-IIIT Pets | Accuracy | 86.011 | SE-ResNet-101 (SAP) |