FUNSD

Form Understanding in Noisy Scanned Documents

Images, Texts, Custom

Form Understanding in Noisy Scanned Documents (FUNSD) is a dataset of 199 real, fully annotated, scanned forms. The documents are noisy and vary widely in appearance, which makes form understanding (FoUn) a challenging task. The dataset supports several tasks, including text detection, optical character recognition, spatial layout analysis, and entity labeling and linking.
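To make the entity labeling and linking tasks concrete, here is a minimal sketch of reading one annotation record. It assumes the layout used by the publicly distributed FUNSD JSON files, which this page does not describe: a top-level `form` list whose entries carry `id`, `text`, `label`, `box`, `linking` (pairs of entity ids), and `words`. The `SAMPLE` record and the `summarize` helper are hypothetical and hand-written for illustration; they are not taken from the dataset.

```python
import json
from collections import Counter

# Hypothetical, hand-written record that mimics the ASSUMED annotation layout.
# It is not real FUNSD data.
SAMPLE = """
{"form": [
  {"id": 0, "text": "Date:", "label": "question",
   "box": [10, 10, 60, 25], "linking": [[0, 1]],
   "words": [{"text": "Date:", "box": [10, 10, 60, 25]}]},
  {"id": 1, "text": "3/14/95", "label": "answer",
   "box": [70, 10, 140, 25], "linking": [[0, 1]],
   "words": [{"text": "3/14/95", "box": [70, 10, 140, 25]}]},
  {"id": 2, "text": "MEMO", "label": "header",
   "box": [10, 40, 60, 55], "linking": [],
   "words": [{"text": "MEMO", "box": [10, 40, 60, 55]}]}
]}
"""


def summarize(annotation):
    """Return (label counts, linked text pairs) for one annotated form."""
    # Index entities by id so link endpoints can be resolved to their text.
    entities = {entity["id"]: entity for entity in annotation["form"]}

    # Entity labeling: count how many entities carry each label.
    label_counts = Counter(entity["label"] for entity in entities.values())

    # Entity linking: a link is assumed to be listed on both of its
    # endpoints, so collect the pairs into a set to drop the duplicates.
    links = {
        tuple(pair)
        for entity in entities.values()
        for pair in entity["linking"]
    }
    pairs = [(entities[a]["text"], entities[b]["text"]) for a, b in sorted(links)]
    return label_counts, pairs


label_counts, pairs = summarize(json.loads(SAMPLE))
print(dict(label_counts))
print(pairs)
```

On a real file, the same helper could be applied to the result of `json.load` on an opened annotation file, provided the file follows the layout assumed above.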

Source: FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents

Image source: https://guillaumejaume.github.io/FUNSD/