IVM-Mix-1M
ImagesTextsIntroduced 2024-05-30
IVM-Mix-1M provide over 1M image-instruction pairs with corresponding instruction-relevant mask labels. Our IVM-Mix-1M dataset consists of three part: HumanLabelData, RobotMachineData and VQAMachineData. For the HumanLabelData and RobotMachineData, we provide well-orgnized images, mask label and language instructions. For the VQAMachineData, we only provide mask label and language instructions, please refer to https://huggingface.co/datasets/2toINF/IVM-Mix-1M and download the images from constituting datasets.