Efficient few-shot learning for pixel-precise handwritten document layout analysis
Axel De Nardin, Silvia Zottin, Matteo Paier, Gian Luca Foresti, Emanuela Colombi, Claudio Piciarelli
Abstract
Layout analysis is a task of utmost importance in ancient handwritten document analysis and represents a fundamental step toward the simplification of subsequent tasks such as optical character recognition and automatic transcription. However, many of the approaches adopted to solve this problem rely on a fully supervised learning paradigm. While these systems achieve very good performance on this task, the drawback is that pixel-precise text labeling of the entire training set is a very time-consuming process, which makes this type of information rarely available in a real-world scenario. In the present paper, we address this problem by proposing an efficient few-shot learning framework that achieves performance comparable to current state-of-the-art fully supervised methods on the publicly available DIVA-HisDB dataset.
Results
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Semantic Segmentation | DIVA-HisDB | Mean IoU (class) | 96.3 | WACV '23 (few-shot) |
| 10-shot image generation | DIVA-HisDB | Mean IoU (class) | 96.3 | WACV '23 (few-shot) |
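
For readers unfamiliar with the metric in the table, the following is a minimal sketch of how a class-wise mean IoU is commonly computed for semantic segmentation: the intersection over union is taken per class and then averaged over classes. It is an illustration only, not the authors' evaluation code, and the official DIVA-HisDB evaluation tool may handle details such as boundary pixels or multi-label pixels differently. The function name `mean_iou`, the flattened-list input format, and the choice to skip classes absent from both maps are assumptions made for this example.

```python
def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over all classes present in pred or target.

    pred and target are flat sequences of integer class labels, one per pixel.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    ious = []
    for c in range(num_classes):
        # Pixels labeled c in both maps.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labeled c in at least one map.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue  # Assumption: a class absent from both maps is skipped.
        ious.append(intersection / union)

    if not ious:
        raise ValueError("no class from range(num_classes) occurs in the inputs")
    return sum(ious) / len(ious)


# Toy 2x3 label maps, flattened row by row (values are made up).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU: {score:.3f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. Multiplying by 100 gives a percentage on the same scale as the value reported in the table.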