25 machine learning datasets
Fallout New Vegas Dialog is a multilingual, sentiment-annotated dialog dataset drawn from the game Fallout New Vegas. The game developers pre-annotated every line of dialog with one of 8 sentiments: anger, disgust, fear, happy, neutral, pained, sad, and surprised. The lines are available in 5 languages: English, Spanish, German, French, and Italian.
DialogCC is a large-scale multi-modal dialogue dataset, which covers diverse real-world topics and various images per dialogue. It contains 651k unique images and is designed for image and text retrieval tasks.
Werewolf Among Us is a multimodal dataset for modeling persuasion behaviors. It contains 199 dialogue transcriptions and videos captured in a multi-player social deduction game setting, 26,647 utterance-level annotations of persuasion strategy, and game-level annotations of deduction game outcomes.
PGDataset (Profile Generation Dataset) is a dataset created for the PGTask (Profile Generation Task), where the goal is to extract/generate a profile sentence given a dialogue utterance.
Dataset for our paper "Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky". It includes 5,000 enterprise tools and the corresponding dialogues, generated using the DiaFORGE UTC data engine.