Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Multimodal Side-Tuning for Document Classification

Stefano Pio Zingaro, Giuseppe Lisanti, Maurizio Gabbrielli

2023-01-16
Transfer Learning, Document Image Classification, Document Classification, Classification

Abstract

In this paper, we propose to exploit the side-tuning framework for multimodal document classification. Side-tuning is a recently introduced methodology for network adaptation that addresses some of the problems of previous approaches. In particular, it overcomes the model rigidity and catastrophic forgetting that affect transfer learning by fine-tuning. The proposed solution uses off-the-shelf deep learning architectures and leverages the side-tuning framework to combine a base model with a tandem of two side networks. We show that side-tuning can also be employed successfully when different data sources are considered, e.g. text and images in document classification. The experimental results show that this approach improves document classification accuracy over the state of the art.
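The abstract does not say how the base model and the two side networks are merged. As a rough illustration only, the sketch below assumes a convex (alpha-blended) combination of their output vectors, which is a common choice in side-tuning generally and not necessarily the one used in this paper. The function name `blend_features`, the variable names, and all numeric values are hypothetical.

```python
from typing import List, Sequence


def blend_features(features: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Return the convex combination of equally sized feature vectors.

    Assumption: the merge is a weighted sum with non-negative weights
    that sum to 1. The paper's actual merge rule may differ.
    """
    if not features or len(features) != len(weights):
        raise ValueError("need exactly one weight per feature vector")
    if any(w < 0.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    # Element-wise weighted sum across the networks' outputs.
    return [sum(w * f[i] for w, f in zip(weights, features))
            for i in range(dim)]


# Hypothetical outputs of a frozen base model and two trainable side networks.
base_out = [1.0, 0.0]
image_side_out = [0.0, 1.0]
text_side_out = [1.0, 1.0]

merged = blend_features([base_out, image_side_out, text_side_out],
                        [0.2, 0.3, 0.5])
```

With weights of (1, 0, 0) the merged output reduces to the base model alone. Moving weight toward the side networks lets the combined model adapt while the base model itself stays unchanged, which is the intuition behind avoiding catastrophic forgetting.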

Results

Task                           Dataset       Metric    Value  Model
Document Image Classification  Tobacco-3482  Accuracy  90.5   Multimodal Side-Tuning (MobileNetV2)
Document Image Classification  Tobacco-3482  Accuracy  90.3   Multimodal Side-Tuning (ResNet50)
Image Classification           Tobacco-3482  Accuracy  90.5   Multimodal Side-Tuning (MobileNetV2)
Image Classification           Tobacco-3482  Accuracy  90.3   Multimodal Side-Tuning (ResNet50)

Related Papers

2025-07-18  RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction
2025-07-17  Disentangling coincident cell events using deep transfer learning and compressive sensing
2025-07-17  Adversarial attacks to image classification systems using evolutionary algorithms
2025-07-16  Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows
2025-07-16  Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation
2025-07-16  Safeguarding Federated Learning-based Road Condition Classification
2025-07-15  Robust-Multi-Task Gradient Boosting
2025-07-13  AI-Enhanced Pediatric Pneumonia Detection: A CNN-Based Approach Using Data Augmentation and Generative Adversarial Networks (GANs)