Ensemble of All

Reported on 6 benchmarks across 1 task · 1 paper · 1 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

6 results

Language Modelling on WikiText-103
Validation perplexity · 2023-11-28
13.11
SOTA
Advancing State of the Art in Language Modeling arXiv:2312.03735
Language Modelling on Penn Treebank (Word Level)
Test perplexity · 2023-11-28
47.31
best: 20.5 (GPT-3 (Zero-Shot))
Advancing State of the Art in Language Modeling arXiv:2312.03735
Language Modelling on Penn Treebank (Word Level)
Validation perplexity · 2023-11-28
48.92
best: 36.1 (BERT-Large-CAS)
Advancing State of the Art in Language Modeling arXiv:2312.03735
Language Modelling on WikiText-103
Test perplexity · 2023-11-28
13.29
best: 2.4 (RETRO (7.5B))
Advancing State of the Art in Language Modeling arXiv:2312.03735
Language Modelling on WikiText-2
Test perplexity · 2023-11-28
53.73
best: 8.21 (SparseGPT (175B, 50% Sparsity))
Advancing State of the Art in Language Modeling arXiv:2312.03735
Language Modelling on WikiText-2
Validation perplexity · 2023-11-28
55.4
best: 15.69 (GPT-2 (fine-tuned))
Advancing State of the Art in Language Modeling arXiv:2312.03735