AWD-LSTM 3-layer with Fraternal dropout
Reported on 4 benchmarks across 1 task · 1 paper
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Medical · 4 results
- Test perplexity · 2017-10-31 · 56.8 · best: 20.5 (GPT-3 (Zero-Shot))
- Validation perplexity · 2017-10-31 · 58.9 · best: 36.1 (BERT-Large-CAS)
- Test perplexity · 2017-10-31 · 64.1 · best: 8.21 (SparseGPT (175B, 50% Sparsity))
- Validation perplexity · 2017-10-31 · 66.8 · best: 15.69 (GPT-2 (fine-tuned))