Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

PixelLink: Detecting Scene Text via Instance Segmentation

Dan Deng, Haifeng Liu, Xuelong Li, Deng Cai

2018-01-04
Text Classification, regression, Scene Text Detection, Segmentation, Semantic Segmentation, Instance Segmentation, Text Detection

Abstract

Most state-of-the-art scene text detection algorithms are deep learning based methods that depend on bounding box regression and perform at least two kinds of predictions: text/non-text classification and location regression. Regression plays a key role in the acquisition of bounding boxes in these methods, but it is not indispensable because text/non-text prediction can also be considered as a kind of semantic segmentation that contains full location information in itself. However, text instances in scene images often lie very close to each other, making them very difficult to separate via semantic segmentation. Therefore, instance segmentation is needed to address this problem. In this paper, PixelLink, a novel scene text detection algorithm based on instance segmentation, is proposed. Text instances are first segmented out by linking pixels within the same instance together. Text bounding boxes are then extracted directly from the segmentation result without location regression. Experiments show that, compared with regression-based methods, PixelLink can achieve better or comparable performance on several benchmarks, while requiring many fewer training iterations and less training data.
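The two steps described in the abstract (linking pixels of the same instance, then reading boxes off the segmentation) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the network has already produced a boolean text/non-text mask and, for each pixel, eight boolean link predictions toward its neighbours. The names `link_pixels` and `NEIGHBOURS`, the neighbour ordering, the rule that a positive link in either direction joins two text pixels, and the use of axis-aligned boxes are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]

# Offsets of the 8 neighbours of a pixel. The order is a hypothetical
# convention for this sketch; links[r][c][k] refers to NEIGHBOURS[k].
NEIGHBOURS: List[Pixel] = [
    (-1, -1), (-1, 0), (-1, 1),
    (0, -1),           (0, 1),
    (1, -1),  (1, 0),  (1, 1),
]


def link_pixels(
    text_mask: List[List[bool]],
    links: List[List[List[bool]]],
) -> List[Tuple[int, int, int, int]]:
    """Group text pixels into instances and return one box per instance.

    text_mask[r][c] is True for pixels predicted as text.
    links[r][c][k] is True if pixel (r, c) predicts a link to its
    k-th neighbour. Boxes are (min_row, min_col, max_row, max_col).
    """
    # Every text pixel starts as its own union-find component.
    parent: Dict[Pixel, Pixel] = {}
    for r, row in enumerate(text_mask):
        for c, is_text in enumerate(row):
            if is_text:
                parent[(r, c)] = (r, c)

    def find(p: Pixel) -> Pixel:
        # Walk to the root, halving the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Join neighbouring text pixels connected by a positive link.
    # Each pixel checks its own outgoing links, so a link that is
    # positive in either direction is enough to merge the pair.
    for (r, c) in list(parent):
        for k, (dr, dc) in enumerate(NEIGHBOURS):
            q = (r + dr, c + dc)
            if q in parent and links[r][c][k]:
                root_p, root_q = find((r, c)), find(q)
                if root_p != root_q:
                    parent[root_q] = root_p

    # Read an axis-aligned box directly off each component,
    # with no location regression involved.
    boxes: Dict[Pixel, List[int]] = {}
    for (r, c) in parent:
        root = find((r, c))
        if root not in boxes:
            boxes[root] = [r, c, r, c]
        else:
            b = boxes[root]
            b[0], b[1] = min(b[0], r), min(b[1], c)
            b[2], b[3] = max(b[2], r), max(b[3], c)
    return sorted((b[0], b[1], b[2], b[3]) for b in boxes.values())
```

Because grouping depends on links rather than on the text mask alone, two text regions that touch in the mask still come out as separate instances when no link joins them, which is the separation problem the abstract attributes to plain semantic segmentation.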

Results

Task                  Dataset      Metric     Value  Model
Scene Text Detection  ICDAR 2013   Precision  88.6   PixelLink+VGG16 2s MS
Scene Text Detection  ICDAR 2013   Recall     87.5   PixelLink+VGG16 2s MS
Scene Text Detection  ICDAR 2015   F-Measure  84.5   SLPR
Scene Text Detection  ICDAR 2015   Precision  85.5   SLPR
Scene Text Detection  ICDAR 2015   Recall     83.6   SLPR
Scene Text Detection  MSRA-TD500   F-Measure  77.8   PixelLink + VGG16 2s
Scene Text Detection  MSRA-TD500   Precision  83     PixelLink + VGG16 2s
Scene Text Detection  MSRA-TD500   Recall     73.2   PixelLink + VGG16 2s
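
The page does not say how F-Measure is derived. Assuming it is the usual harmonic mean of precision and recall, the listed ICDAR 2015 and MSRA-TD500 values can be checked with a short sketch. The function name `f_measure` is chosen for this example only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values copied from the results table above.
print(round(f_measure(85.5, 83.6), 1))  # ICDAR 2015: 84.5
print(round(f_measure(83.0, 73.2), 1))  # MSRA-TD500: 77.8
```

Both results match the F-Measure rows in the table, which supports the harmonic-mean assumption for these entries.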

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Language Integration in Fine-Tuning Multimodal Large Language Models for Image-Based Regression (2025-07-20)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)