19,997 machine learning datasets
19,997 dataset results
ABO is a large-scale dataset designed for material prediction and multi-view retrieval experiments. The dataset contains Blender renderings of 30 viewpoints for each of the 7,953 3D objects, as well as camera intrinsics and extrinsic for each rendering.
DrawBench is a comprehensive and challenging benchmark for text-to-image models, introduced by the Imagen research team. Let me provide you with more details:
MathVerse is an innovative benchmark specifically designed to rigorously evaluate the capabilities of Multi-modal Large Language Models (MLLMs) in interpreting and reasoning with visual information in mathematical problems. Developed by a research team from CUHK MMLab and Shanghai Artificial Intelligence Laboratory, MathVerse offers an equitable and comprehensive assessment of MLLMs' ability to understand and process visual diagrams for mathematical reasoning.
The Radboud Faces Database (RaFD) is a set of pictures of 67 models (both adult and children, males and females) displaying 8 emotional expressions.
We release Douban Conversation Corpus, comprising a training data set, a development set and a test set for retrieval based chatbot. The statistics of Douban Conversation Corpus are shown in the following table.
CoMA contains 17,794 meshes of the human face in various expressions
iSAID contains 655,451 object instances for 15 categories across 2,806 high-resolution images. The images of iSAID is the same as the DOTA-v1.0 dataset, which are manily collected from the Google Earth, some are taken by satellite JL-1, the others are taken by satellite GF-2 of the China Centre for Resources Satellite Data and Application.
The MetaQA dataset consists of a movie ontology derived from the WikiMovies Dataset and three sets of question-answer pairs written in natural language: 1-hop, 2-hop, and 3-hop queries.
ChestX-ray8 is a medical imaging dataset which comprises 108,948 frontal-view X-ray images of 32,717 (collected from the year of 1992 to 2015) unique patients with the text-mined eight common disease labels, mined from the text radiological reports via NLP techniques.
The INTERACTION dataset contains naturalistic motions of various traffic participants in a variety of highly interactive driving scenarios from different countries. The dataset can serve for many behavior-related research areas, such as
5987 high spatial resolution (0.3 m) remote sensing images from Nanjing, Changzhou, and Wuhan Focus on different geographical environments between Urban and Rural Advance both semantic segmentation and domain adaptation tasks Three considerable challenges: Multi-scale objects Complex background samples Inconsistent class distributions
The XSTest dataset is a test suite designed to identify exaggerated safety behaviors in large language models. It was introduced to systematically study the phenomenon where some models refuse even clearly safe prompts if they use similar language to unsafe prompts or mention sensitive topics.
The Oulu-CASIA NIR&VIS facial expression database consists of six expressions (surprise, happiness, sadness, anger, fear and disgust) from 80 people between 23 and 58 years old. 73.8% of the subjects are males. The subjects were asked to sit on a chair in the observation room in a way that he/ she is in front of camera. Camera-face distance is about 60 cm. Subjects were asked to make a facial expression according to an expression example shown in picture sequences. The imaging hardware works at the rate of 25 frames per second and the image resolution is 320 × 240 pixels.
Volleyball is a video action recognition dataset. It has 4830 annotated frames that were handpicked from 55 videos with 9 player action labels and 8 team activity labels. It contains group activity annotations as well as individual activity annotations.
The Light Field Saliency Database (LFSD) contains 100 light fields with 360×360 spatial resolution. A rough focal stack and an all-focus image are provided for each light field. The images in this dataset usually have one salient foreground object and a background with good color contrast.
Places-LT has an imbalanced training set with 62,500 images for 365 classes from Places-2. The class frequencies follow a natural power law distribution with a maximum number of 4,980 images per class and a minimum number of 5 images per class. The validation and testing sets are balanced and contain 20 and 100 images per class respectively.
The AMiner Dataset is a collection of different relational datasets. It consists of a set of relational networks such as citation networks, academic social networks or topic-paper-autor networks among others.
The ReferIt dataset contains 130,525 expressions for referring to 96,654 objects in 19,894 images of natural scenes.
The ProofWriter dataset contains many small rulebases of facts and rules, expressed in English. Each rulebase also has a set of questions (English statements) which can either be proven true or false using proofs of various depths, or the answer is “Unknown” (in open-world setting, OWA) or assumed negative (in closed-world setting, CWA).