


MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages

Jack FitzGerald, Christopher Hench, Charith Peris, Scott Mackie, Kay Rottmann, Ana Sanchez, Aaron Nash, Liam Urbach, Vishesh Kakarala, Richa Singh, Swetha Ranganath, Laurie Crist, Misha Britan, Wouter Leeuwis, Gokhan Tur, Prem Natarajan

2022-04-18

Zero-Shot Intent Classification, Zero-Shot Slot Filling, Intent Classification, Natural Language Understanding, Slot Filling, XLM-R

Abstract

We present the MASSIVE dataset--Multilingual Amazon Slu resource package (SLURP) for Slot-filling, Intent classification, and Virtual assistant Evaluation. MASSIVE contains 1M realistic, parallel, labeled virtual assistant utterances spanning 51 languages, 18 domains, 60 intents, and 55 slots. MASSIVE was created by tasking professional translators to localize the English-only SLURP dataset into 50 typologically diverse languages from 29 genera. We also present modeling results on XLM-R and mT5, including exact match accuracy, intent classification accuracy, and slot-filling F1 score. We have released our dataset, modeling code, and models publicly.

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Intent Classification | MASSIVE | Intent Accuracy | 86.1 | mT5 Base (encoder-only) |
| Intent Classification | MASSIVE | Intent Accuracy | 85.3 | mT5 Base (text-to-text) |
| Intent Classification | MASSIVE | Intent Accuracy | 85.1 | XLM-R Base |
| Slot Filling | MASSIVE | Slot F1 Score | 83.6 | XLM-R Base |
| Slot Filling | MASSIVE | Slot F1 Score | 82.2 | mT5 Base (encoder-only) |
| Slot Filling | MASSIVE | Slot F1 Score | 81.3 | mT5 Base (text-to-text) |
| Slot Filling | MASSIVE | Slot F1 Score | 64.2 | XLM-R Base |
| Slot Filling | MASSIVE | Slot F1 Score | 56.9 | mT5 Base (encoder-only) |
| Slot Filling | MASSIVE | Slot F1 Score | 50.6 | mT5 Base (text-to-text) |
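
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how intent accuracy, slot F1, and exact match accuracy could be computed. It is not the authors' evaluation code. It assumes slots are represented as (slot type, value) pairs per utterance and that slot F1 is micro-averaged over those pairs; the function names and the example labels are hypothetical, and the official implementation may define slot matching differently.

```python
from collections import Counter


def intent_accuracy(gold_intents, pred_intents):
    """Fraction of utterances whose predicted intent equals the gold intent."""
    if len(gold_intents) != len(pred_intents):
        raise ValueError("gold and predicted lists must have the same length")
    correct = sum(g == p for g, p in zip(gold_intents, pred_intents))
    return correct / len(gold_intents)


def slot_f1(gold_slots, pred_slots):
    """Micro-averaged F1 over (slot_type, value) pairs (an assumed definition)."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_slots, pred_slots):
        gold_count, pred_count = Counter(gold), Counter(pred)
        # Multiset intersection counts pairs present in both gold and prediction.
        overlap = sum((gold_count & pred_count).values())
        tp += overlap
        fp += sum(pred_count.values()) - overlap
        fn += sum(gold_count.values()) - overlap
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def exact_match_accuracy(gold_intents, pred_intents, gold_slots, pred_slots):
    """Fraction of utterances with the correct intent and exactly the gold slots."""
    matches = 0
    for gi, pi, gs, ps in zip(gold_intents, pred_intents, gold_slots, pred_slots):
        if gi == pi and Counter(gs) == Counter(ps):
            matches += 1
    return matches / len(gold_intents)


# Illustrative toy data (not taken from MASSIVE).
gold_intents = ["alarm_set", "weather_query"]
pred_intents = ["alarm_set", "play_music"]
gold_slots = [[("time", "9 am")], [("place_name", "paris"), ("date", "today")]]
pred_slots = [[("time", "9 am")], [("place_name", "paris")]]

print(intent_accuracy(gold_intents, pred_intents))
print(slot_f1(gold_slots, pred_slots))
print(exact_match_accuracy(gold_intents, pred_intents, gold_slots, pred_slots))
```

Exact match is the strictest of the three under this definition, since a single wrong or missing slot makes the whole utterance count as incorrect.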
