LLaMA 65B (zero-shot)
Reported on 6 benchmarks across 3 tasks · 1 paper
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Natural Language Processing · 6 results
- Accuracy (High) · 2023-02-27 · 51.6 · best: 92.6 (ALBERTxxlarge+DUMA(ensemble))
- Accuracy (Middle) · 2023-02-27 · 67.9 · best: 93.1 (Megatron-BERT (ensemble))
- Accuracy · 2023-02-27 · 52.3 · best: 83.2 (Unicorn 11B (fine-tuned))
- Accuracy · 2023-02-27 · 60.2 · best: 78.4 (FLAN 137B (zero-shot))
- EM · 2023-02-27 · 68.2 · best: 87.5 (Claude 2 (few-shot, k=5))
- Accuracy · uses extra data · 2023-02-27 · 56 · best: 96.4 (GPT-4 (few-shot, k=25))