Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

MMLU

Massive Multitask Language Understanding

Texts | Custom | Introduced 2020-09-07

MMLU (Massive Multitask Language Understanding) is a benchmark designed to measure knowledge acquired during pretraining by evaluating models exclusively in zero-shot and few-shot settings. This makes the benchmark more challenging and more similar to how we evaluate humans. The benchmark covers 57 subjects across STEM, the humanities, the social sciences, and more. It ranges in difficulty from an elementary level to an advanced professional level, and it tests both world knowledge and problem-solving ability. Subjects range from traditional areas, such as mathematics and history, to more specialized areas like law and ethics. The granularity and breadth of the subjects make the benchmark well suited to identifying a model’s blind spots.

Image source: https://arxiv.org/pdf/2009.03300v3.pdf
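To make the zero-shot and few-shot settings described above concrete, the following is a minimal sketch of how a k-shot multiple-choice prompt might be assembled and how accuracy could be scored. Everything here is an assumption for illustration and is not taken from this page: the `Question` class, the four answer labels A to D, the header wording, and the function names are hypothetical, and this is not the official evaluation code.

```python
from dataclasses import dataclass

# Assumption: each question has four options labeled A-D.
CHOICE_LABELS = ["A", "B", "C", "D"]


@dataclass
class Question:
    """Hypothetical container for one multiple-choice item."""
    subject: str
    text: str
    choices: list[str]
    answer: str  # one of CHOICE_LABELS


def format_question(q: Question, include_answer: bool) -> str:
    """Render a question; solved examples show the answer, the target does not."""
    lines = [q.text]
    lines += [f"{label}. {choice}" for label, choice in zip(CHOICE_LABELS, q.choices)]
    lines.append(f"Answer: {q.answer}" if include_answer else "Answer:")
    return "\n".join(lines)


def build_prompt(examples: list[Question], target: Question, k: int) -> str:
    """Build a k-shot prompt; k = 0 yields a zero-shot prompt."""
    # Illustrative header wording, not a required format.
    header = f"The following are multiple choice questions (with answers) about {target.subject}."
    blocks = [header]
    blocks += [format_question(e, include_answer=True) for e in examples[:k]]
    blocks.append(format_question(target, include_answer=False))
    return "\n\n".join(blocks)


def accuracy(predictions: list[str], questions: list[Question]) -> float:
    """Percentage of predicted labels that match the gold answers."""
    correct = sum(p == q.answer for p, q in zip(predictions, questions))
    return 100.0 * correct / len(questions)
```

In this sketch, a model's output would be reduced to a single label (for example by comparing the scores it assigns to "A", "B", "C", and "D") before calling `accuracy`. An average over per-subject accuracies could then be computed across subjects.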

Benchmarks

Multi-Task Learning / Average (%)
Question Answering / Accuracy
Transfer Learning / Average (%)

Related Benchmarks

MMLU (5-Shot) / Multi-Task Learning / MMLU (5-shot)
MMLU (5-Shot) / Transfer Learning / MMLU (5-shot)
MMLU (Abstract Algebra) / Question Answering / Accuracy
MMLU (Anatomy) / Question Answering / Accuracy
MMLU (Astronomy) / Question Answering / Accuracy
MMLU (Clinical Knowledge) / Question Answering / Accuracy
MMLU (College Biology) / Question Answering / Accuracy
MMLU (College Chemistry) / Question Answering / Accuracy
MMLU (College Computer Science) / Question Answering / Accuracy
MMLU (College Mathematics) / Question Answering / Accuracy
MMLU (College Medicine) / Question Answering / Accuracy
MMLU (College Physics) / Question Answering / Accuracy
MMLU (Econometrics) / Question Answering / Accuracy
MMLU (Electrical Engineer) / Question Answering / Accuracy
MMLU (Elementary Mathematics) / Question Answering / Accuracy
MMLU (Formal Logic) / Question Answering / Accuracy
MMLU (High School Biology) / Question Answering / Accuracy
MMLU (High School Chemistry) / Question Answering / Accuracy
MMLU (High School Computer Science) / Question Answering / Accuracy
MMLU (High School Mathematics) / Question Answering / Accuracy
MMLU (High School Physics) / Question Answering / Accuracy
MMLU (High School Statistics) / Question Answering / Accuracy
MMLU (Machine Learning) / Question Answering / Accuracy
MMLU (Medical Genetics) / Question Answering / Accuracy
MMLU (Professional medicine) / Question Answering / Accuracy
MMLU-Pro / MMLU / 0-shot MRR

Statistics

Papers: 1,922
Benchmarks: 3

Tasks

Multi-Task Learning
Multi-task Language Understanding
Multiple Choice Question Answering (MCQA)
Natural Language Understanding
Question Answering
Single Choice Question
Text Generation
Transfer Learning