FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation

Zhe Chen, Jiahao Wang, Wenhai Wang, Guo Chen, Enze Xie, Ping Luo, Tong Lu

2021-11-03

Image Classification, Scene Text Detection, Text Detection

Abstract

We propose an accurate and efficient scene text detection framework, termed FAST (i.e., faster arbitrarily-shaped text detector). Unlike recent advanced text detectors, whose complicated post-processing and hand-crafted network architectures result in low inference speed, FAST has two new designs. (1) We design a minimalist kernel representation (with only a 1-channel output) to model text of arbitrary shape, together with a GPU-parallel post-processing step that assembles text lines with negligible time overhead. (2) We search for a network architecture tailored to text detection, which yields more powerful features than most networks searched for image classification. Benefiting from these two designs, FAST achieves an excellent trade-off between accuracy and efficiency on several challenging datasets, including Total-Text, CTW1500, ICDAR 2015, and MSRA-TD500. For example, FAST-T yields 81.6% F-measure at 152 FPS on Total-Text, outperforming the previous fastest method by 1.7 points in F-measure and 70 FPS in speed. With TensorRT optimization, the inference speed can be further accelerated to over 600 FPS. Code and models will be released at https://github.com/czczup/FAST.
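The abstract does not spell out how a 1-channel kernel map is turned back into text instances, so the following is only a rough sketch under stated assumptions: each text instance is assumed to be predicted as a shrunken "kernel" region, kernels are separated by connected-component labeling, and each label is then grown back with a small max filter (a CPU analogue of max pooling). The function names `label_kernels` and `dilate_labels`, the toy `kernel_map`, and the 3x3 window are all hypothetical, and the pure-Python loops stand in for what would be GPU-parallel operations.

```python
from collections import deque


def label_kernels(kernel_map):
    """Assign a distinct integer label to each 4-connected blob of 1s."""
    h, w = len(kernel_map), len(kernel_map[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if kernel_map[y][x] and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                # Breadth-first flood fill over the 4-neighbourhood.
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and kernel_map[ny][nx]
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def dilate_labels(labels, size=3):
    """Grow every labeled kernel with a size x size max filter.

    Where two grown kernels overlap, the larger label wins (assumption).
    """
    h, w = len(labels), len(labels[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = max(
                labels[ny][nx]
                for ny in range(max(0, y - r), min(h, y + r + 1))
                for nx in range(max(0, x - r), min(w, x + r + 1))
            )
    return out


# Toy 1-channel kernel map with two separate kernels (hypothetical data).
kernel_map = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

labels, num_kernels = label_kernels(kernel_map)
instances = dilate_labels(labels, size=3)
for row in instances:
    print(row)
```

Because both steps are local operations on a single-channel map, they are the kind of work that maps naturally onto parallel hardware, which is consistent with the abstract's claim of negligible post-processing overhead.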

Results

Task	Dataset	Model	Precision (%)	Recall (%)	F-Measure (%)	FPS
Scene Text Detection	Total-Text	FAST-B-800	90	85.2	-	46
Scene Text Detection	Total-Text	FAST-B-640	89.9	83.2	-	67.5
Scene Text Detection	Total-Text	FAST-B-512	89.6	82.4	-	93.2
Scene Text Detection	Total-Text	FAST-S-512	88.3	81.7	-	115.5
Scene Text Detection	Total-Text	FAST-T-448	86.5	77.2	-	152.8
Scene Text Detection	SCUT-CTW1500	FAST-B-640	87.8	80.9	84.2	66.5
Scene Text Detection	SCUT-CTW1500	FAST-B-512	85.7	80.2	82.9	92.6
Scene Text Detection	SCUT-CTW1500	FAST-S-512	85.6	78.7	82	112.9
Scene Text Detection	SCUT-CTW1500	FAST-T-512	85.5	77.9	81.5	129.1
Scene Text Detection	ICDAR 2015	FAST-B-1280	89.7	84.6	87.1	15.7
Scene Text Detection	ICDAR 2015	FAST-B-896	89.2	83.6	86.3	31.8
Scene Text Detection	ICDAR 2015	FAST-B-736	88	81.7	84.7	42.7
Scene Text Detection	ICDAR 2015	FAST-S-736	86.3	79.8	82.9	53.9
Scene Text Detection	ICDAR 2015	FAST-T-736	86	77.9	81.7	60.9
Scene Text Detection	MSRA-TD500	FAST-B-736	92.1	83	87.3	56.8
Scene Text Detection	MSRA-TD500	FAST-S-736	91.6	81.7	86.4	72
Scene Text Detection	MSRA-TD500	FAST-T-736	88.1	81.9	84.9	79.6
Scene Text Detection	MSRA-TD500	FAST-T-512	91.1	78.8	84.5	137.2
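
The Total-Text rows above report precision and recall but no F-measure. Assuming the F-measure here is the usual harmonic mean of precision and recall, the missing values can be derived from the listed numbers. The short sketch below does this; the helper name `f_measure` is illustrative only. Under that assumption, FAST-T-448 comes out at about 81.6, in line with the figure quoted in the abstract, and the same formula gives about 84.2 for FAST-B-640 on SCUT-CTW1500, matching the listed value.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the Total-Text rows of the table.
total_text = {
    "FAST-B-800": (90.0, 85.2),
    "FAST-B-640": (89.9, 83.2),
    "FAST-B-512": (89.6, 82.4),
    "FAST-S-512": (88.3, 81.7),
    "FAST-T-448": (86.5, 77.2),
}

for model, (p, r) in total_text.items():
    print(f"{model}: F-measure = {f_measure(p, r):.1f}")
```

These derived values are approximate, since the listed precision and recall are already rounded to one decimal place.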
