RVL-CDIP_N_MP
RVL-CDIP-N multi-page
Images, Texts · Apache 2.0 · Introduced 2023-08-24
RVL-CDIP_MP-N can serve its original goal as a covariate shift test set, now for multi-page document classification. We retrieved the original full documents from DocumentCloud and through web search.
It uses the same 16-class label taxonomy as RVL-CDIP and contains close to 1K documents in PDF format, averaging 10 pages per document.
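
As a rough illustration of the structure described above, the following sketch lists the 16 RVL-CDIP classes and computes simple statistics over a collection of multi-page documents. The `Document` record, the `summarize` helper, and the toy entries are hypothetical and are not the dataset's actual schema or contents; only the label names come from RVL-CDIP.

```python
from dataclasses import dataclass
from statistics import mean

# The 16 RVL-CDIP classes, in the index order used by the original dataset.
RVL_CDIP_LABELS = [
    "letter", "form", "email", "handwritten", "advertisement",
    "scientific report", "scientific publication", "specification",
    "file folder", "news article", "budget", "invoice",
    "presentation", "questionnaire", "resume", "memo",
]


@dataclass
class Document:
    """Hypothetical record for one multi-page PDF (assumed layout, not the real schema)."""
    path: str
    label: str
    num_pages: int


def summarize(docs):
    """Check every label against the taxonomy; return (document count, mean pages)."""
    for doc in docs:
        if doc.label not in RVL_CDIP_LABELS:
            raise ValueError(f"unknown label: {doc.label!r}")
    return len(docs), mean(doc.num_pages for doc in docs)


# Toy records for illustration only; these are not real dataset entries.
toy_docs = [
    Document("a.pdf", "letter", 4),
    Document("b.pdf", "budget", 16),
    Document("c.pdf", "memo", 10),
]
count, avg_pages = summarize(toy_docs)
```

Run over the real PDFs, with page counts read by a PDF library of your choice, the same summary should come out near 1K documents and about 10 pages per document if the figures above hold.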