271 machine learning datasets
271 dataset results
This dataset contains the extraction made in 2022 of all the 622 datasets that existed then at the UCI Machine Learning Repository. It contains the index, its name, its url, the instances (number os lines), the number of attributes (columns), the year it was created, the area, such as Life, Social, etc., the web_hits at the time, the data folder url, where the data were in the internet, the dataset_file_url, the URL for the data, the dataset_file_format (format, such as data, txt, Z, etc), the names_file_url, which describe the files with the description of the attributes, the names_file_format which describe the format of the previous file, the attribute_info, which describe the information of all the attributes or columns that are in the dataset, the source, the data_set_information, the relevant_papers associated with this dataset, the papers_that_cite_this_data_set, and a final column with the number of papers that cite this dataset.
Articles originating from subreddits with explicitly stated ideologies are categorized into three groups: 72,488 articles in the Liberal class, 79,573 articles in the Conservative class, and 225,083 articles in the Restricted class.
BASEPROD provides comprehensive rover sensor data collected over a 1.7 km traverse, accompanied by high-resolution 2D and 3D drone maps of the terrain. The dataset also includes laser-induced breakdown spectroscopy (LIBS) measurements from key sampling sites along the rover's path, as well as weather station data to contextualize environmental conditions.
The Insider Threat Test Dataset is a collection of synthetic insider threat test datasets that provide both background and malicious actor synthetic data.
Replication Data for: Integrating Earth Observation Data into Causal Inference: Challenges and Opportunities
The dataset includes time-stamped user product reviews behavior from January, 2008 to October, 2018. Each user has a sequence of produce review events with each event containing the timestamp and category of the reviewed product, with each category corresponding to an event type.
The dataset has two years of user awards on a question-answering website: each user received a sequence of badges and there are 22 different kinds of badges in total.
The dataset contains historical financial transactions, including time, category and cost fields. There are 50000 clients, 205 categories and 43.7M events. The original goal was to predict the age group of the client. In this variant of the dataset, the goal is to forecast multiple future events.
This repository contains the database of the FEM simulation of axially impacted various configurations of the square crash boxes. This database records the impact of the structural and crash test parameters on the various crashworthiness objectives.
Fact-based Text Editing dataset based on WebNLG dataset.
Fact-based Text Editing dataset based on RotoWire dataset
The LinkedResults dataset contains around 1,600 results capturing performance of machine learning models from tables of 239 papers. All tables come from a subset of SegmentedTables dataset. Each result is a tuple of form (task, dataset, metric name, metric value) and is linked to a particular table, row and cell it originates from.
Co/FeMn bilayers measured.
This data contains the election polls for the 2004, 2008, 2012, and 2016 US presidential election by state including data on undecided voter proportions.
The ARPA-E funded TERRA-REF project is generating open-access reference datasets for the study of plant sensing, genomics, and phenomics. Sensor data were generated by a field scanner sensing platform that captures color, thermal, hyperspectral, and active flourescence imagery as well as three dimensional structure and associated environmental measurements. This dataset is provided alongside data collected using traditional field methods in order to support calibration and validation of algorithms used to extract plot level phenotypes from these datasets.
This dataset consisting 500 set of caption, table and coresponding paper page, processed from DocBank.
This dataset are about Nafion 112 membrane standard tests and MEA activation tests of PEM fuel cell in various operation condition. Dataset include two general electrochemical analysis method, Polarization and Impedance curves. In this dataset, effect of different pressure of H2/O2 gas, different voltages and various humidity conditions in several steps are considered. Behavior of PEM fuel cell during distinct operation condition tests, activation procedure and different operation condition before and after activation analysis can be concluded from data. In Polarization curves, voltage and power density change as a function of flows of H2/O2 and relative humidity. Resistance of the used equivalent circuit of fuel cell can be calculated from Impedance data. Thus, experimental response of the cell is obvious in the presented data, which is useful in depth analysis, simulation and material performance investigation in PEM fuel cell researches.
This dataset includes Direct Borohydride Fuel Cell (DBFC) impedance and polarization test in anode with Pd/C, Pt/C and Pd decorated Ni–Co/rGO catalysts. In fact, different concentration of Sodium Borohydride (SBH), applied voltages and various anode catalysts loading with explanation of experimental details of electrochemical analysis are considered in data. Voltage, power density and resistance of DBFC change as a function of weight percent of SBH (%), applied voltage and amount of anode catalyst loading that are evaluated by polarization and impedance curves with using appropriate equivalent circuit of fuel cell. Can be stated that interpretation of electrochemical behavior changes by the data of related cell is inevitable, which can be useful in simulation, power source investigation and depth analysis in DB fuel cell researches.
These are larger MATLAB .mat files required for reproducing plots from the sgbaird-5DOF/interp repository for grain boundary property interpolation. gitID-0055bee_uuID-475a2dfd_paper-data6.mat contains multiple trials of five degree-of-freedom interpolation model runs for various interpolation schemes. gpr46883_gitID-b473165_puuID-50ffdcf6_kim-rng11.mat contains a Gaussian Process Regression model trained on 46883 Fe simulation GBs. See Five degree-of-freedom property interpolation of arbitrary grain boundaries via Voronoi fundamental zone framework DOI: 10.1016/j.commatsci.2021.110756 for the peer-reviewed, published version of the paper.
EUEN17037 Daylight and View Standard Test Dataset.