Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

GLUE

General Language Understanding Evaluation benchmark

Modality: Texts · License: Custom (various) · Introduced: 2019-01-01

The General Language Understanding Evaluation (GLUE) benchmark is a collection of nine natural language understanding tasks: the single-sentence tasks CoLA and SST-2; the similarity and paraphrase tasks MRPC, STS-B, and QQP; and the natural language inference tasks MNLI, QNLI, RTE, and WNLI.
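The nine tasks and their three groupings from the description above can be captured in a small data structure. This is an illustrative sketch only, not an official API; the dictionary name and keys are assumptions:

```python
# Hypothetical grouping of the nine GLUE tasks, following the
# description above (not an official GLUE/PWC data structure).
GLUE_TASKS = {
    "single_sentence": ["CoLA", "SST-2"],
    "similarity_paraphrase": ["MRPC", "STS-B", "QQP"],
    "natural_language_inference": ["MNLI", "QNLI", "RTE", "WNLI"],
}

# Total task count across all three groups.
n_tasks = sum(len(tasks) for tasks in GLUE_TASKS.values())
```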

Source: Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models
Image Source: https://gluebenchmark.com/

Benchmarks

Natural Language Understanding/Average

Related Benchmarks

GLUE COLA/Classification/Matthews Correlation
GLUE COLA/Text Classification/Matthews Correlation
GLUE MRPC/Classification/Accuracy
GLUE MRPC/Classification/F1
GLUE MRPC/Text Classification/Accuracy
GLUE MRPC/Text Classification/F1
GLUE QQP/Few-Shot Learning/F1-score
GLUE QQP/Meta-Learning/F1-score
GLUE RTE/Classification/Accuracy
GLUE RTE/Text Classification/Accuracy
GLUE SST2/Classification/Accuracy
GLUE SST2/Text Classification/Accuracy
GLUE STSB/Classification/Spearmanr
GLUE STSB/Text Classification/Spearmanr
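Several of these benchmarks score models with the Matthews correlation coefficient, the metric used for CoLA. As a minimal sketch of that metric for binary labels (a plain-Python illustration, not the official GLUE evaluation script):

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary 0/1 labels.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)),
    ranging from -1 (total disagreement) to +1 (perfect prediction).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Degenerate case (a confusion-matrix row or column is empty):
    # return 0.0 by the usual convention.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Perfect predictions give 1.0, fully inverted predictions give -1.0, which makes MCC informative on the class-imbalanced CoLA data where plain accuracy can be misleading.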

Statistics

Papers: 3,197
Benchmarks: 1

Links

Homepage

Tasks

Classification
Data-free Knowledge Distillation
Few-Shot Learning
Linguistic Acceptability
Model Compression
Natural Language Inference
Natural Language Understanding
QQP
Semantic Textual Similarity
Semantic Textual Similarity within Bi-Encoder
Sentence-Embedding
Stochastic Optimization
Text Classification
Text Generation