LLaMA 33B (zero-shot)
Reported on 6 benchmarks across 3 tasks · 1 paper
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Natural Language Processing · 6 results
- Accuracy (High) · 2023-02-27 · 48.3 · best: 92.6 (ALBERTxxlarge+DUMA(ensemble))
- Accuracy (Middle) · 2023-02-27 · 64.1 · best: 93.1 (Megatron-BERT (ensemble))
- Accuracy · 2023-02-27 · 50.4 · best: 83.2 (Unicorn 11B (fine-tuned))
- EM · 2023-02-27 · 24.9 · best: 64 (Atlas (full, Wiki-dec-2018 index))
- Accuracy · 2023-02-27 · 58.6 · best: 78.4 (FLAN 137B (zero-shot))
- Accuracy · 2023-02-27 · 57.8 · best: 96.4 (GPT-4 (few-shot, k=25))