19,997 machine learning datasets
A short video clip may contain a progression of multiple events and an interesting storyline. A human viewer needs to capture the event in every shot and associate the events together to understand the story behind them.
GENIE, which stands for GENeratIve Evaluation, is a system designed to standardize human evaluations across different text generation tasks. It was introduced to produce consistent evaluations that are reproducible over time and across different populations. The system is instantiated with datasets representing four core challenges in text generation: machine translation, summarization, commonsense reasoning, and machine comprehension. For each task, GENIE offers a leaderboard that automatically crowdsources annotations for submissions, evaluating them along axes such as correctness, conciseness, and fluency.
The ePiC dataset is a unique and high-quality crowdsourced collection of narratives specifically designed for testing abstract language understanding in the context of proverbs. The ePiC dataset stands out for its focus on abstract reasoning in language models, a relatively under-explored area in natural language processing.
The characterRelations dataset contains 2,170 annotations of character relations in 109 literary texts, as documented in characterRelations.pdf. Each annotation describes a character dyad along four dimensions of interest: coarse-grained category (social, familial, professional), fine-grained category (e.g., friend, lover, parent, rival, employer), affinity (positive, negative, neutral), and change: we do not assume that a relationship is static, so we also collect judgments as to whether it changes at any point in the course of the text.
ScandiQA is a question-answering dataset for the Mainland Scandinavian languages: Danish, Norwegian, and Swedish. Developed as part of the ScandEval benchmarking platform, it consists of questions and answers in these languages and is designed to evaluate language models' ability to comprehend and answer questions in them, with the broader aim of advancing the state of natural language processing for the Scandinavian languages.
''I have read and agree to the terms and conditions'' is one of the biggest lies on the Internet: consumers rarely read the contracts they are required to accept. We conclude agreements over the Internet daily, but do we know their content? Do we check for potentially unfair statements? Online, we probably skip most Terms and Conditions. However, we conclude many more contracts than that. Imagine buying a house or a car, sending our kids to a nursery, or opening a bank account. In all these situations you must conclude a contract, but there is a high probability that you will not read the entire agreement with proper understanding. European consumer law aims to prevent businesses from using so-called ''unfair contractual terms'' in the unilaterally drafted contracts that consumers are required to accept.
Multi-CPR is a multi-domain Chinese dataset for passage retrieval. The data is collected from three different domains: E-commerce, Entertainment video, and Medical. Each domain's dataset contains millions of passages and a set of human-annotated query-passage relevance pairs.
DialSummEval is a multi-faceted dataset of human judgments, created to revisit the evaluation of dialogue summarization models. It contains the outputs of 14 models on SAMSum, a dialogue summarization dataset.
DiFair addresses an oversight in the evaluation of gender neutrality in pretrained language models: the impact of bias mitigation on useful gender knowledge. The metric quantifies not only a model's biased tendencies but also how well the model preserves useful gender knowledge.
COPEN is a COnceptual knowledge Probing benchmark that aims to analyze the conceptual understanding capabilities of Pre-trained Language Models (PLMs). Specifically, COPEN consists of three tasks:
We collect a total of 13,380 images captured in 2,210 different scenes, using different objects and backgrounds to build our dataset.
RPLAN is a manually collected, large-scale, densely annotated dataset of floor plans from real residential buildings.
1. DATASET We created various types of network attacks in an Internet of Things (IoT) environment for academic purposes. Two typical smart home devices -- the SKT NUGU (NU 100) and the EZVIZ Wi-Fi Camera (C2C Mini O Plus 1080P) -- were used, and all devices, including several laptops and smartphones, were connected to the same wireless network. The dataset consists of 42 raw network packet capture (pcap) files recorded at different points in time.
X-rays, CT Images and Genomic Sequences representing cases of tuberculosis.
The OpenCitations Meta database stores and delivers bibliographic metadata for all publications involved in the OpenCitations Index.
3D meshes of various garments of various sizes draped on people with various body poses and shapes.
Collected by a single VLP-16 LiDAR mounted on a small vehicle (1 m x 1 m).
The dataset contains historical technical data from the Dhaka Stock Exchange (DSE), collected from various publicly available sources on the internet. The data is provided for information and research purposes; although, to the best of our knowledge, it contains no mistakes, some errors may still be present. Using this dataset for portfolio management is not encouraged; use it at your own risk. The contributors accept no liability for any such use.
Graph Neural Networks (GNNs) have gained traction across domains such as transportation, bio-informatics, language processing, and computer vision. However, there is a noticeable absence of research applying GNNs to supply chain networks. Supply chain networks are inherently graph-like in structure, making them prime candidates for GNN methodologies, which opens up possibilities for optimizing, predicting, and solving even the most complex supply chain problems. A major obstacle is the absence of real-world benchmark datasets to facilitate research on supply chain problems using GNNs. To address this, we present a real-world benchmark dataset for temporal tasks, obtained from one of the leading FMCG companies in Bangladesh, focusing on supply chain planning for production purposes. The dataset includes temporal data as node features to enable sales predictions, production planning, and the identification of fact
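To illustrate why a supply chain maps naturally onto a GNN's input, the sketch below builds a toy supply-chain graph with temporal node features and performs one round of mean-aggregation message passing, the core operation of a GNN layer. The node names, edges, and numbers are invented for illustration and are not taken from the dataset itself.

```python
# Illustrative sketch (not the dataset's code): a hypothetical supply chain
# where a plant supplies two distributors, who in turn supply a retailer.
edges = [("plant", "dist_a"), ("plant", "dist_b"),
         ("dist_a", "retail"), ("dist_b", "retail")]

# Temporal node features: e.g. weekly sales/production figures (made-up).
features = {
    "plant":  [100.0, 110.0, 120.0],
    "dist_a": [40.0, 45.0, 50.0],
    "dist_b": [55.0, 50.0, 60.0],
    "retail": [90.0, 92.0, 105.0],
}

# Build incoming-neighbor lists: who supplies each node.
inbound = {}
for src, dst in edges:
    inbound.setdefault(dst, []).append(src)

def aggregate(node):
    """Mean of the upstream suppliers' feature vectors (one GNN-style hop)."""
    nbrs = inbound.get(node, [])
    if not nbrs:
        return features[node]
    return [sum(vals) / len(nbrs)
            for vals in zip(*(features[n] for n in nbrs))]

# The retailer's aggregated signal blends both distributors' weekly series.
print(aggregate("retail"))  # -> [47.5, 47.5, 55.0]
```

A real GNN would interleave such aggregation with learned weight matrices and nonlinearities, but the graph-structured, temporal-feature representation is the same.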