Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection

Bo Du, Jian Ye, Jing Zhang, Juhua Liu, DaCheng Tao

2021-08-03 | Scene Text Detection, Text Detection

Abstract

Existing methods for arbitrary-shaped text detection in natural scenes face two critical issues, i.e., 1) fracture detections at the gaps in a text instance; and 2) inaccurate detections of arbitrary-shaped text instances with diverse background context. To address these issues, we propose a novel method named Intra- and Inter-Instance Collaborative Learning (I3CL). Specifically, to address the first issue, we design an effective convolutional module with multiple receptive fields, which is able to collaboratively learn better character and gap feature representations at local and long ranges inside a text instance. To address the second issue, we devise an instance-based transformer module to exploit the dependencies between different text instances and a global context module to exploit the semantic context from the shared background, which are able to collaboratively learn more discriminative text feature representation. In this way, I3CL can effectively exploit the intra- and inter-instance dependencies together in a unified end-to-end trainable framework. Besides, to make full use of the unlabeled data, we design an effective semi-supervised learning method to leverage the pseudo labels via an ensemble strategy. Without bells and whistles, experimental results show that the proposed I3CL sets new state-of-the-art results on three challenging public benchmarks, i.e., an F-measure of 77.5% on ICDAR2019-ArT, 86.9% on Total-Text, and 86.4% on CTW-1500. Notably, our I3CL with the ResNeSt-101 backbone ranked 1st place on the ICDAR2019-ArT leaderboard. The source code will be available at https://github.com/ViTAE-Transformer/ViTAE-Transformer-Scene-Text-Detection.

Results

Task                  Dataset       Metric     Value  Model
Scene Text Detection  Total-Text    Precision  89.8   I3CL + SSL (ResNet-50)
Scene Text Detection  Total-Text    Recall     84.2   I3CL + SSL (ResNet-50)
Scene Text Detection  SCUT-CTW1500  F-Measure  86.5   I3CL + SSL
Scene Text Detection  SCUT-CTW1500  Precision  88.4   I3CL + SSL
Scene Text Detection  SCUT-CTW1500  Recall     84.6   I3CL + SSL
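
The results above list precision and recall for Total-Text but no F-measure. As a minimal sketch, assuming the benchmarks report the standard F-measure (the harmonic mean of precision and recall, i.e. F1), the missing value can be derived from the listed numbers. The function name `f_measure` and the `results` dictionary are illustrative and not taken from the paper or its code.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1, beta = 1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs copied from the results listed above (in percent).
results = {
    "Total-Text": (89.8, 84.2),
    "SCUT-CTW1500": (88.4, 84.6),
}

# Round to one decimal place, matching the precision used in the results.
scores = {name: round(f_measure(p, r), 1) for name, (p, r) in results.items()}

for name, score in scores.items():
    print(f"{name}: F-measure = {score}")
```

Under this assumption the SCUT-CTW1500 pair gives 86.5, which agrees with the listed F-measure, and the Total-Text pair gives 86.9.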

Related Papers

AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models (2025-07-07)
PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning (2025-06-18)
Task-driven real-world super-resolution of document scans (2025-06-08)
CL-ISR: A Contrastive Learning and Implicit Stance Reasoning Framework for Misleading Text Detection on Social Media (2025-06-05)
Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors (2025-05-30)
The Devil is in Fine-tuning and Long-tailed Problems: A New Benchmark for Scene Text Detection (2025-05-21)
Trends and Challenges in Authorship Analysis: A Review of ML, DL, and LLM Approaches (2025-05-21)
AGENT-X: Adaptive Guideline-based Expert Network for Threshold-free AI-generated teXt detection (2025-05-21)