MixNet: Toward Accurate Detection of Challenging Scene Text in the Wild

Yu-Xiang Zeng, Jun-Wei Hsieh, Xin Li, Ming-Ching Chang

2023-08-23 | Scene Text Detection | Text Detection

Abstract

Detecting small scene text instances in the wild is particularly challenging because irregular positions and nonideal lighting often lead to detection errors. We present MixNet, a hybrid architecture that combines the strengths of CNNs and Transformers and can accurately detect small text in challenging natural scenes, regardless of orientation, style, and lighting conditions. MixNet incorporates two key modules: (1) the Feature Shuffle Network (FSNet), which serves as the backbone, and (2) the Central Transformer Block (CTBlock), which exploits the 1D manifold constraint of scene text. We first introduce a novel feature shuffling strategy in FSNet to facilitate the exchange of features across multiple scales, generating high-resolution features superior to those of the popular ResNet and HRNet backbones. The FSNet backbone achieves significant improvements over many existing text detection methods, including PAN, DB, and FAST. We then design a complementary CTBlock that leverages centerline-based features, similar to the medial axis of text regions, and show that it can outperform contour-based approaches in challenging cases where small text instances appear close together. Extensive experimental results show that MixNet, which mixes FSNet with CTBlock, achieves state-of-the-art results on multiple scene text detection datasets.
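The abstract does not spell out how FSNet exchanges features across scales, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each scale's channels are split into as many groups as there are scales, that group i of every scale is resampled to the resolution of scale i, and that the resampled groups are concatenated. The names `resize_nearest`, `shuffle_features`, and `make_map`, the nearest-neighbor resampling, and the nested-list feature maps are all illustrative assumptions.

```python
def resize_nearest(channel, out_h, out_w):
    """Nearest-neighbor resize of one 2D channel (a list of rows)."""
    in_h, in_w = len(channel), len(channel[0])
    return [
        [channel[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def shuffle_features(scales):
    """Exchange channel groups across scales (hypothetical sketch).

    `scales` is a list of feature maps; each feature map is a list of
    2D channels. Output scale i receives channel group i from every
    input scale, resampled to the spatial size of input scale i.
    """
    n = len(scales)
    for fmap in scales:
        if len(fmap) % n != 0:
            raise ValueError("channel count must be divisible by the number of scales")

    shuffled = []
    for i, target in enumerate(scales):
        target_h, target_w = len(target[0]), len(target[0][0])
        mixed = []
        for source in scales:
            group = len(source) // n
            for channel in source[i * group:(i + 1) * group]:
                mixed.append(resize_nearest(channel, target_h, target_w))
        shuffled.append(mixed)
    return shuffled


def make_map(channels, size, value):
    """Build a constant feature map with `channels` channels of size x size."""
    return [[[value] * size for _ in range(size)] for _ in range(channels)]


# Two scales: a high-resolution map (2 channels, 4x4) and a
# low-resolution map (4 channels, 2x2).
high = make_map(2, 4, 1.0)
low = make_map(4, 2, 2.0)
out = shuffle_features([high, low])
for i, fmap in enumerate(out):
    print(f"scale {i}: {len(fmap)} channels of {len(fmap[0])}x{len(fmap[0][0])}")
```

In this toy setup every output scale keeps its own resolution but now holds channels that originated at both resolutions, which is the kind of cross-scale exchange the abstract describes. A real backbone would use learned convolutions and tensor operations rather than nested lists.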

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Scene Text Detection | Total-Text | FPS | 15.2 | MixNet |
| Scene Text Detection | Total-Text | Precision | 93 | MixNet |
| Scene Text Detection | Total-Text | Recall | 88.1 | MixNet |
| Scene Text Detection | SCUT-CTW1500 | F-Measure | 89.8 | MixNet |
| Scene Text Detection | SCUT-CTW1500 | FPS | 15.2 | MixNet |
| Scene Text Detection | SCUT-CTW1500 | Precision | 91.4 | MixNet |
| Scene Text Detection | SCUT-CTW1500 | Recall | 88.3 | MixNet |
| Scene Text Detection | IC19-Art | H-Mean | 79.7 | MixNet |
| Scene Text Detection | MSRA-TD500 | F-Measure | 89.4 | MixNet |
| Scene Text Detection | MSRA-TD500 | FPS | 15.2 | MixNet |
| Scene Text Detection | MSRA-TD500 | Precision | 90.7 | MixNet |
| Scene Text Detection | MSRA-TD500 | Recall | 88.1 | MixNet |
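
The F-Measure values listed above are consistent with the standard harmonic mean of precision and recall. As a quick check, the sketch below recomputes it from the listed precision and recall values. The function name `f_measure` and the `reported` dictionary are illustrative, and the Total-Text figure it prints is derived here, not taken from the listing.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the results listing.
reported = {
    "SCUT-CTW1500": (91.4, 88.3),
    "MSRA-TD500": (90.7, 88.1),
    "Total-Text": (93.0, 88.1),  # no F-Measure is listed for this dataset
}

for dataset, (p, r) in reported.items():
    print(f"{dataset}: F = {f_measure(p, r):.1f}")
```

For SCUT-CTW1500 and MSRA-TD500 the recomputed values round to the listed 89.8 and 89.4.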

Related Papers

AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models (2025-07-07)
PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning (2025-06-18)
Task-driven real-world super-resolution of document scans (2025-06-08)
CL-ISR: A Contrastive Learning and Implicit Stance Reasoning Framework for Misleading Text Detection on Social Media (2025-06-05)
Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors (2025-05-30)
The Devil is in Fine-tuning and Long-tailed Problems: A New Benchmark for Scene Text Detection (2025-05-21)
Trends and Challenges in Authorship Analysis: A Review of ML, DL, and LLM Approaches (2025-05-21)
AGENT-X: Adaptive Guideline-based Expert Network for Threshold-free AI-generated teXt detection (2025-05-21)