
IMPACT Patent is a large-scale multimodal patent dataset with detailed captions for design patent figures.

💥 Our dataset includes more than 435,000 design patents comprising 3.61 million figures, along with captions, from patents granted by the United States Patent and Trademark Office (USPTO) over a 16-year period from 2007 to 2022.

Multimodality: We introduce a multimodal patent dataset that includes patent images, metadata, and detailed captions to support a variety of NLP, vision, and multimodal tasks. This dataset is also valuable for patent analysis tasks such as classification, retrieval, prior art searches, and design trend analysis.
Comprehensive dataset: We compiled 435,101 U.S. design patents spanning 16 years, with a total of 3,609,805 drawing figures. Each record has eleven fields, including the title, patent ID, claims, date of publication, classification code, and image-related information such as the number of images per patent and descriptions of the viewpoints.
Descriptive captions: Design patents lack descriptions of the designs themselves, such as their features and shapes. To fill this gap, we use a vision-language model to generate detailed captions for the design figures, capturing details from each sketch. These captions, paired with the images, enrich the dataset and make it a valuable resource for advanced patent analysis and multimodal research applications.
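To make the record layout described above more concrete, here is a minimal Python sketch of what a single entry might look like. The class name, field names, types, and example values are all assumptions made for illustration. The description names only some of the eleven fields and does not give their actual column names or formats. The helper at the end simply divides the two totals quoted above to get the average number of figures per patent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PatentRecord:
    """Hypothetical record layout; names and types are illustrative only."""
    patent_id: str
    title: str
    claims: str
    publication_date: str      # date format is an assumption
    classification_code: str
    num_images: int            # number of figures in this patent
    viewpoint_descriptions: List[str] = field(default_factory=list)
    captions: List[str] = field(default_factory=list)  # model-generated captions


def average_figures_per_patent(total_figures: int, total_patents: int) -> float:
    """Mean number of drawing figures per patent."""
    return total_figures / total_patents


# Totals taken from the dataset description.
avg = average_figures_per_patent(3_609_805, 435_101)
print(f"Average figures per patent: {avg:.2f}")

# Placeholder values, not a real entry from the dataset.
example = PatentRecord(
    patent_id="PLACEHOLDER-ID",
    title="Placeholder title",
    claims="Placeholder claim text.",
    publication_date="2007-01-01",
    classification_code="PLACEHOLDER-CODE",
    num_images=2,
    viewpoint_descriptions=["front view", "side view"],
    captions=["placeholder caption 1", "placeholder caption 2"],
)
```

The totals work out to a little over eight figures per patent on average, which is consistent with design patents typically showing an object from several viewpoints.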

IMPACT Patent
