Question Answering on TruthfulQA

Metric: % true (GPT-judge) (higher is better)
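
As a rough illustration of how a "% true" score could be aggregated, the sketch below assumes each model answer has already been labeled true or false by the judge model. The function name `percent_true` and the sample labels are hypothetical; they are not taken from the benchmark's own code or results.

```python
from typing import Sequence


def percent_true(judge_labels: Sequence[bool]) -> float:
    """Return the percentage of answers the judge labeled as true.

    `judge_labels` is a hypothetical input: one boolean per answer,
    True when the judge classified that answer as truthful.
    """
    if not judge_labels:
        raise ValueError("need at least one judged answer")
    # sum() counts the True entries because bool is a subclass of int.
    return 100.0 * sum(judge_labels) / len(judge_labels)


# Made-up labels for four answers, for illustration only.
example_labels = [True, False, True, True]
print(f"{percent_true(example_labels):.1f}% true")
```

A higher value means a larger share of answers was judged truthful, which matches the "higher is better" direction of the metric.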
