Ensemble of All
Reported on 6 benchmarks across 1 task · 1 paper · 1 SOTA
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Medical · 6 results
- Validation perplexity · 2023-11-28 · 13.11 (SOTA)
- Test perplexity · 2023-11-28 · 47.31 · best: 20.5 (GPT-3 (Zero-Shot))
- Validation perplexity · 2023-11-28 · 48.92 · best: 36.1 (BERT-Large-CAS)
- Test perplexity · 2023-11-28 · 13.29 · best: 2.4 (RETRO (7.5B))
- Test perplexity · 2023-11-28 · 53.73 · best: 8.21 (SparseGPT (175B, 50% Sparsity))
- Validation perplexity · 2023-11-28 · 55.4 · best: 15.69 (GPT-2 (fine-tuned))