Sentence Ordering on EconLogicQA
Metric: Accuracy (higher is better)
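The page does not say how accuracy is scored for this task. As a rough sketch only, assuming a prediction counts as correct when the predicted order matches the reference order exactly, the metric could be computed as below. The function name `ordering_accuracy` and the sample labels are hypothetical and are not taken from the benchmark's code.

```python
from typing import Sequence


def ordering_accuracy(
    predictions: Sequence[Sequence[str]],
    references: Sequence[Sequence[str]],
) -> float:
    """Fraction of examples whose predicted order equals the reference order.

    Assumption: exact-match scoring. The benchmark's actual scoring rule
    is not stated on this page and may differ.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # An example is correct only if every position matches.
    correct = sum(
        1 for pred, ref in zip(predictions, references) if list(pred) == list(ref)
    )
    return correct / len(references)


# Hypothetical example: two questions, one ordered correctly.
preds = [["A", "B", "C", "D"], ["B", "A", "C", "D"]]
refs = [["A", "B", "C", "D"], ["A", "B", "C", "D"]]
score = ordering_accuracy(preds, refs)
```

Under this assumption a partially correct ordering earns no credit, which would be one reason scores on an ordering task can sit well below those seen on multiple-choice benchmarks.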
Results
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | GPT-4-Turbo | 0.5692 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 2 | GPT-4 | 0.5538 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 3 | GPT-3.5-Turbo | 0.3769 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 4 | Llama-3-8B-Instruct | 0.3462 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 5 | Mistral-7B-Instruct-v0.2 | 0.3154 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 6 | Mistral-7B-v0.1 | 0.2615 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 7 | Mistral-7B-v0.2 | 0.2615 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 8 | Llama-3-8B | 0.2385 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 9 | Zephyr-7B-Alpha | 0.2308 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 10 | Yi-6B-Chat | 0.2077 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 11 | Zephyr-7B-Beta | 0.1769 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 12 | Mistral-7B-Instruct-v0.1 | 0.1538 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 13 | Llama-2-13B-Chat | 0.1462 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 14 | Llama-2-7B-Chat | 0.0923 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 15 | Gemma-2B-IT | 0.0846 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 16 | Yi-6B | 0.0385 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 17 | Gemma-7B-IT | 0.0231 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |
| 18 | Llama-2-7B | 0.0077 | No | EconLogicQA: A Question-Answering Benchmark for ... | 2024-05-13 | Code |