folktexts

Texts · MIT · Introduced 2024-07-19

A collection of natural-language prompt-completion pairs for multiple-choice Q&A on benchmark tasks derived from US Census data products. The benchmark tasks are available through the folktexts Python package. The main goal is to provide a basis for evaluating how well LLMs quantify uncertainty on inherently unpredictable outcomes, i.e., how well they capture aleatoric uncertainty. It is essentially a natural-language counterpart of the popular folktables tabular data package.
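To make the idea concrete, the following is a minimal sketch of the two steps the description implies: rendering a tabular record as a multiple-choice prompt, and scoring the model's answer probabilities against observed outcomes. It does not use the folktexts API. The helper names `row_to_prompt` and `expected_calibration_error`, the prompt wording, and the example feature names are all assumptions made for illustration, and expected calibration error is only one common way to measure how well predicted probabilities match outcome frequencies.

```python
import numpy as np


def row_to_prompt(row: dict, question: str, choices: dict) -> str:
    """Render one tabular record as a multiple-choice prompt.

    Hypothetical template: the wording is illustrative, not the
    template used by the folktexts package.
    """
    facts = "\n".join(f"- {name}: {value}" for name, value in row.items())
    options = "\n".join(f"{key}. {text}" for key, text in choices.items())
    return (
        f"Information about a person:\n{facts}\n\n"
        f"Question: {question}\n{options}\nAnswer:"
    )


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Expected calibration error with equal-width bins on [0, 1].

    y_true: binary outcomes (0 or 1).
    y_prob: predicted probability of the positive outcome, e.g. the
            probability mass a model places on the positive answer choice.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    # Map each probability to a bin index; a probability of 1.0 goes in the last bin.
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between observed frequency and mean predicted probability,
            # weighted by the fraction of samples falling in this bin.
            gap = abs(y_true[mask].mean() - y_prob[mask].mean())
            ece += mask.mean() * gap
    return ece


# Hypothetical record and question, for illustration only.
prompt = row_to_prompt(
    {"Age": 30, "Occupation": "Teacher"},
    "Is this person's yearly income above $50,000?",
    {"A": "Yes", "B": "No"},
)
score = expected_calibration_error([0, 0, 1, 1], [0.25, 0.25, 0.75, 0.75])
```

A well-calibrated model is not one that is always confident, but one whose stated probabilities match how often the outcome actually occurs. That is the relevant notion when the outcome cannot be predicted with certainty from the available features.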