Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

A DeNoising FPN With Transformer R-CNN for Tiny Object Detection

Hou-I Liu, Yu-Wen Tseng, Kai-Cheng Chang, Pin-Jyun Wang, Hong-Han Shuai, Wen-Huang Cheng

2024-06-09 | Denoising, Contrastive Learning, Object Detection

Abstract

Despite notable advances in computer vision, precise detection of tiny objects remains a significant challenge, largely because these objects occupy very few pixels in an image. The challenge is especially relevant in geoscience and remote sensing, where high-fidelity detection of tiny objects supports applications ranging from urban planning to environmental monitoring. In this paper, we propose a new framework, DeNoising FPN with Trans R-CNN (DNTR), to improve the performance of tiny object detection. DNTR consists of an easy plug-in design, DeNoising FPN (DN-FPN), and an effective Transformer-based detector, Trans R-CNN. First, feature fusion in the feature pyramid network (FPN) is important for detecting multiscale objects. However, the fusion process may produce noisy features because there is no regularization between the features of different scales. We therefore introduce a DN-FPN module that uses contrastive learning to suppress noise in each level's features in the top-down path of the FPN. Second, building on the two-stage framework, we replace the obsolete R-CNN detector with a novel Trans R-CNN detector that uses self-attention to focus on the representation of tiny objects. Experimental results show that DNTR outperforms the baselines by at least 17.4% in terms of APvt on the AI-TOD dataset and by at least 9.6% in terms of AP on the VisDrone dataset. Our code will be available at https://github.com/hoiliu-0801/DNTR.
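The two ingredients the abstract names for DN-FPN, top-down feature fusion and a contrastive objective, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not say how positives and negatives are chosen or which loss form is used, so the generic InfoNCE loss below, the nearest-neighbour upsampling, and all function names (`upsample_nearest_2x`, `top_down_fuse`, `info_nce`) are assumptions made for illustration only.

```python
import math


def upsample_nearest_2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def top_down_fuse(coarse, lateral):
    """One generic FPN top-down step: upsample the coarser map, then add
    the finer (lateral) map element-wise. Assumes lateral is exactly 2x
    the size of coarse in both dimensions."""
    up = upsample_nearest_2x(coarse)
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (an assumed stand-in for the paper's loss):
    -log( exp(s_pos / t) / (exp(s_pos / t) + sum_k exp(s_neg_k / t)) )."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_denominator - logits[0]


if __name__ == "__main__":
    coarse = [[1.0, 2.0], [3.0, 4.0]]              # toy 2x2 top-level map
    lateral = [[0.5] * 4 for _ in range(4)]        # toy 4x4 lower-level map
    print(top_down_fuse(coarse, lateral))
    # The loss is small when the anchor matches its positive and large otherwise.
    print(info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]]))
    print(info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]]))
```

In this sketch the contrastive term would pull a fused feature toward a chosen reference feature and push it away from the others, which is one plausible reading of "suppressing noise" after fusion; the actual pairing strategy should be taken from the paper or the linked repository.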

Results

Task              Dataset  Metric  Value  Model
Object Detection  AI-TOD   AP      26.2   DNTR
Object Detection  AI-TOD   AP50    56.7   DNTR
Object Detection  AI-TOD   AP75    20.2   DNTR
Object Detection  AI-TOD   APm     37     DNTR
Object Detection  AI-TOD   APs     31     DNTR
Object Detection  AI-TOD   APt     26.4   DNTR
Object Detection  AI-TOD   APvt    12.8   DNTR

The same seven AI-TOD values are also listed under the task labels 3D, 2D Classification, 2D Object Detection, and 16k.

Related Papers

fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models (2025-07-17)
SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts (2025-07-17)
HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)