DISL
Fueling Research with A Large Dataset of Solidity Smart Contracts
Texts | MIT | Introduced 2024-03-25
The full dataset report is available at: https://arxiv.org/abs/2403.16861
The DISL dataset is a collection of 514,506 unique Solidity files that have been deployed to Ethereum mainnet. It addresses the need for a large and diverse dataset of real-world smart contracts. DISL serves as a resource for developing machine learning systems and for benchmarking software engineering tools designed for smart contracts.
- Curated by: Gabriele Morello
- License: MIT
Instructions to explore the dataset
import random

from datasets import load_dataset

# Load the raw dataset
dataset = load_dataset("ASSERT-KTH/DISL", "raw")
# OR load the decomposed dataset instead:
# dataset = load_dataset("ASSERT-KTH/DISL", "decomposed")

# Number of rows and columns
num_rows = len(dataset["train"])
num_columns = len(dataset["train"].column_names)

# Pick a random row
random_row = random.choice(dataset["train"])

# Print the source code of that random row
random_sc = random_row["source_code"]
print(random_sc)
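A simple next step when exploring the data is to tally the compiler version constraints declared in each file. The sketch below assumes only that each row exposes a `source_code` string, as in the snippet above. The names `PRAGMA_RE` and `pragma_versions` and the three sample rows are made up for illustration and are not part of the dataset or its tooling.

```python
import re
from collections import Counter

# Matches e.g. `pragma solidity ^0.8.0;` and captures the version constraint.
PRAGMA_RE = re.compile(r"pragma\s+solidity\s+([^;]+);")


def pragma_versions(rows):
    """Count Solidity pragma constraints across rows holding a `source_code` string."""
    counts = Counter()
    for row in rows:
        for constraint in PRAGMA_RE.findall(row["source_code"]):
            counts[constraint.strip()] += 1
    return counts


# Tiny in-memory stand-in for dataset["train"]; these rows are invented examples.
sample_rows = [
    {"source_code": "pragma solidity ^0.8.0;\ncontract A {}"},
    {"source_code": "pragma solidity ^0.8.0;\ncontract B {}"},
    {"source_code": "pragma solidity >=0.6.0 <0.9.0;\ncontract C {}"},
]

counts = pragma_versions(sample_rows)
print(counts.most_common())
```

Passing `dataset["train"]` in place of `sample_rows` should work the same way, since iterating over the split yields one dictionary per row, though a full pass over every file may take a while.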