Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation

Zhe Chen, Jiahao Wang, Wenhai Wang, Guo Chen, Enze Xie, Ping Luo, Tong Lu

2021-11-03 | Image Classification, Scene Text Detection, Text Detection

Abstract

We propose an accurate and efficient scene text detection framework, termed FAST (i.e., faster arbitrarily-shaped text detector). Unlike recent advanced text detectors, whose complicated post-processing and hand-crafted network architectures result in low inference speed, FAST has two new designs. (1) We design a minimalist kernel representation (with only a 1-channel output) to model text of arbitrary shape, together with a GPU-parallel post-processing step that efficiently assembles text lines with negligible time overhead. (2) We search for a network architecture tailored to text detection, which yields more powerful features than most networks searched for image classification. Benefiting from these two designs, FAST achieves an excellent trade-off between accuracy and efficiency on several challenging datasets, including Total-Text, CTW1500, ICDAR 2015, and MSRA-TD500. For example, FAST-T yields 81.6% F-measure at 152 FPS on Total-Text, outperforming the previous fastest method by 1.7 points in accuracy and 70 FPS in speed. With TensorRT optimization, the inference speed can be further accelerated to over 600 FPS. Code and models will be released at https://github.com/czczup/FAST.
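
The abstract describes a 1-channel kernel map that post-processing assembles into text instances. As a rough CPU illustration only (not the authors' GPU-parallel implementation, whose details are not given on this page), one plausible reading is: threshold the map, label connected components so that each shrunken kernel gets its own id, then grow each labeled kernel outward. The function names, the 0.5 threshold, the 4-connectivity, and the single 3x3 dilation step are all assumptions made for this sketch.

```python
from collections import deque


def label_kernels(prob_map, threshold=0.5):
    """Label 4-connected regions of pixels above `threshold` (hypothetical helper).

    Returns (labels, count); background pixels keep label 0.
    """
    h, w = len(prob_map), len(prob_map[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if prob_map[y][x] <= threshold or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            # Breadth-first flood fill over the 4-neighbourhood.
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not labels[ny][nx]
                            and prob_map[ny][nx] > threshold):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def dilate_labels(labels, size=3):
    """Grow each kernel by one step (assumed): a background pixel takes the
    largest label found in its size x size window; labeled pixels are kept."""
    h, w = len(labels), len(labels[0])
    r = size // 2
    out = [row[:] for row in labels]
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                continue
            out[y][x] = max(
                labels[ny][nx]
                for ny in range(max(0, y - r), min(h, y + r + 1))
                for nx in range(max(0, x - r), min(w, x + r + 1))
            )
    return out


# Toy 1-channel kernel map with two separate text kernels (made-up values).
kernel_map = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
labels, count = label_kernels(kernel_map)
instances = dilate_labels(labels)
for row in instances:
    print(row)
```

Every step here is a local, per-pixel operation, which is the kind of work that maps naturally onto a GPU; the nested Python loops are only for readability.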

Results

Task: Scene Text Detection (all rows). Precision, Recall, and F-Measure are in %; FPS is frames per second. A hyphen marks a value not listed.

Dataset        Model         Precision  Recall  F-Measure  FPS
Total-Text     FAST-B-800    90.0       85.2    -          46.0
Total-Text     FAST-B-640    89.9       83.2    -          67.5
Total-Text     FAST-B-512    89.6       82.4    -          93.2
Total-Text     FAST-S-512    88.3       81.7    -          115.5
Total-Text     FAST-T-448    86.5       77.2    -          152.8
SCUT-CTW1500   FAST-B-640    87.8       80.9    84.2       66.5
SCUT-CTW1500   FAST-B-512    85.7       80.2    82.9       92.6
SCUT-CTW1500   FAST-S-512    85.6       78.7    82.0       112.9
SCUT-CTW1500   FAST-T-512    85.5       77.9    81.5       129.1
ICDAR 2015     FAST-B-1280   89.7       84.6    87.1       15.7
ICDAR 2015     FAST-B-896    89.2       83.6    86.3       31.8
ICDAR 2015     FAST-B-736    88.0       81.7    84.7       42.7
ICDAR 2015     FAST-S-736    86.3       79.8    82.9       53.9
ICDAR 2015     FAST-T-736    86.0       77.9    81.7       60.9
MSRA-TD500     FAST-B-736    92.1       83.0    87.3       56.8
MSRA-TD500     FAST-S-736    91.6       81.7    86.4       72.0
MSRA-TD500     FAST-T-736    88.1       81.9    84.9       79.6
MSRA-TD500     FAST-T-512    91.1       78.8    84.5       137.2
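
The Total-Text rows above list precision and recall but no F-measure. F-measure is conventionally the harmonic mean of precision and recall, so it can be approximated from the listed values. The sketch below assumes that definition applies to these leaderboard numbers; because it starts from rounded percentages rather than raw detection counts, the last digit may differ from an official evaluation. The function name `f_measure` is a hypothetical helper.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both in the same units, e.g. %)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) in % for the Total-Text rows listed on this page.
total_text = {
    "FAST-B-800": (90.0, 85.2),
    "FAST-B-640": (89.9, 83.2),
    "FAST-B-512": (89.6, 82.4),
    "FAST-S-512": (88.3, 81.7),
    "FAST-T-448": (86.5, 77.2),
}

for model, (p, r) in total_text.items():
    print(f"{model}: F-measure ~ {f_measure(p, r):.1f}")
```

For FAST-T-448 this gives roughly 81.6, which agrees with the 81.6% F-measure the abstract reports for FAST-T on Total-Text.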
