Linjie Deng, Yanxiang Gong, Yi Lin, Jingwen Shuai, Xiaoguang Tu, Yuefei Zhang, Zheng Ma, Mei Xie
Previous approaches to scene text detection usually rely on manually defined sliding windows. This work presents an intuitive two-stage region-based method that detects multi-oriented text without any prior knowledge of the textual shape. In the first stage, we estimate the possible locations of text instances by detecting and linking corners instead of shifting a set of default anchors. The resulting quadrilateral proposals are geometry-adaptive, which allows our method to cope with various text aspect ratios and orientations. In the second stage, we design a new pooling layer named Dual-RoI Pooling, which embeds data augmentation inside the region-wise subnetwork for more robust classification and regression over these proposals. Experimental results on public benchmarks confirm that the proposed method achieves performance comparable to that of state-of-the-art methods. The code is publicly available at https://github.com/xhzdeng/crpn
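To make the idea of linking corners into quadrilateral proposals concrete, the following is a minimal sketch and not the linking rule used in the paper or its released code. It assumes, purely for illustration, that a detector has already produced candidate points for four corner types (the keys `tl`, `tr`, `br`, `bl` and the function names are hypothetical), and it keeps every combination that forms a convex quadrilateral with a consistent winding order. No anchor shapes or aspect ratios are fixed in advance, which is the property the abstract calls geometry-adaptive.

```python
from itertools import product

def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def is_valid_quad(quad):
    """True if the points, taken in TL, TR, BR, BL order, form a convex quad.

    In image coordinates (y grows downward) every turn along this order
    must have a strictly positive cross product.
    """
    n = len(quad)
    return all(
        cross(quad[i], quad[(i + 1) % n], quad[(i + 2) % n]) > 0
        for i in range(n)
    )

def link_corners(corners):
    """Enumerate quadrilateral proposals from typed corner candidates.

    `corners` maps the hypothetical keys "tl", "tr", "br", "bl" to lists
    of (x, y) points. This brute-force enumeration is an assumption made
    for illustration only.
    """
    candidates = product(corners["tl"], corners["tr"], corners["br"], corners["bl"])
    return [quad for quad in candidates if is_valid_quad(quad)]

# A slightly rotated text region plus one spurious top-right candidate.
corners = {
    "tl": [(10, 20)],
    "tr": [(110, 30), (0, 25)],   # the second point lies left of the top-left corner
    "br": [(105, 70)],
    "bl": [(5, 60)],
}
proposals = link_corners(corners)
print(len(proposals), proposals)
```

Only the combination using the plausible top-right corner survives the convexity check. A practical system would also need to score and prune proposals, which this sketch omits.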
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Scene Text Detection | ICDAR 2013 | Precision | 91.9 | Corner-based Region Proposals |
| Scene Text Detection | ICDAR 2013 | Recall | 83.9 | Corner-based Region Proposals |
| Scene Text Detection | ICDAR 2015 | F-Measure | 84.5 | Corner-based Region Proposals |
| Scene Text Detection | ICDAR 2015 | Precision | 88.7 | Corner-based Region Proposals |
| Scene Text Detection | ICDAR 2015 | Recall | 80.7 | Corner-based Region Proposals |
| Scene Text Detection | COCO-Text | F-Measure | 59.1 | Corner-based Region Proposals |
| Scene Text Detection | COCO-Text | Precision | 55.5 | Corner-based Region Proposals |
| Scene Text Detection | COCO-Text | Recall | 63.3 | Corner-based Region Proposals |
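The F-Measure rows in the table can be cross-checked against the precision and recall rows, since the F-measure is conventionally the harmonic mean of the two. The short sketch below assumes that standard definition (the table itself does not state the formula) and uses a hypothetical helper name, `f_measure`. The table lists no F-Measure for ICDAR 2013, so that dataset is left out of the check.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F-measure) taken from the table above.
reported = {
    "ICDAR 2015": (88.7, 80.7, 84.5),
    "COCO-Text": (55.5, 63.3, 59.1),
}

for dataset, (p, r, f_reported) in reported.items():
    f_computed = round(f_measure(p, r), 1)
    print(f"{dataset}: computed {f_computed}, reported {f_reported}")
```

Both computed values agree with the reported ones to one decimal place.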