SCICAP
Images, Texts | CC0 1.0 | Introduced 2021-10-22
SCICAP is a large-scale image captioning dataset that contains real-world scientific figures and captions. SCICAP was constructed using more than two million images from over 290,000 papers collected and released by arXiv.
Image source: https://arxiv.org/pdf/2110.11624v1.pdf