MolParser-7M
Images, Texts; Introduced 2024-11-17
A large-scale optical chemical structure recognition (OCSR) dataset, proposed in the paper “MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild”. MolParser-7M contains nearly 8 million paired image-SMILES samples. Note that the caption of each image is written in the extended-SMILES format proposed in the paper, not in plain SMILES.
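Because the captions are in extended-SMILES and not plain SMILES, tools that expect plain SMILES may need the caption split first. The sketch below is illustrative only: it assumes a caption has the form `SMILES<sep>EXTENSION`, and the `<sep>` delimiter, the `split_caption` helper, and the sample caption are assumptions not stated in this entry. Check the paper for the exact grammar before relying on it.

```python
from typing import NamedTuple

# Assumed delimiter between the SMILES part and the extension part.
# This is not stated in the dataset entry; verify it against the paper.
SEPARATOR = "<sep>"


class Caption(NamedTuple):
    smiles: str      # text before the separator
    extension: str   # text after the separator, empty if there is none


def split_caption(caption: str) -> Caption:
    """Split a caption at the first separator (hypothetical helper)."""
    smiles, _, extension = caption.partition(SEPARATOR)
    return Caption(smiles.strip(), extension.strip())


# Hypothetical caption, used only to show the split.
example = split_caption("*c1ccccc1<sep><a>0:R1</a>")
print(example.smiles)
print(example.extension)
```

A caption without the separator comes back with an empty `extension`, so the same helper can be applied to every sample.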