Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Mistral 7B

Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed

Published: 2023-10-10

Tasks: Zero-Shot Video Question Answering · Question Answering · Mathematical Reasoning · Math · Math Word Problem Solving · Multi-task Language Understanding · Sentence Completion · Common Sense Reasoning · Answerability Prediction · Chatbot · World Knowledge · Arithmetic Reasoning · Code Generation · Language Modelling
Paper · PDF · Code (official)

Abstract

We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all evaluated benchmarks, and Llama 1 34B in reasoning, mathematics, and code generation. Our model leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences of arbitrary length with a reduced inference cost. We also provide a model fine-tuned to follow instructions, Mistral 7B -- Instruct, that surpasses the Llama 2 13B -- Chat model both on human and automated benchmarks. Our models are released under the Apache 2.0 license.
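The sliding window attention described above restricts each token to attending over a fixed-size window of recent positions rather than the full causal prefix, which bounds per-token attention cost as sequences grow. The sketch below illustrates the masking idea only; the function name and the small window size are illustrative (Mistral 7B's released code uses a window of 4096 tokens), not the paper's implementation.

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Causal attention mask restricted to a sliding window.

    Position i may attend to positions j with i - window < j <= i.
    Illustrative sketch of the SWA idea from the abstract, not the
    official implementation.
    """
    i = np.arange(seq_len)[:, None]  # query positions (rows)
    j = np.arange(seq_len)[None, :]  # key positions (columns)
    return (j <= i) & (j > i - window)

# With a window of 3 over 5 positions, token 4 attends to tokens 2..4 only.
mask = sliding_window_mask(5, 3)
print(mask.astype(int))
```

Because information still propagates one window per layer, stacking k layers gives an effective attention span of roughly k × window, which is how a fixed window can still cover long sequences.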

Results

Task | Dataset | Metric | Value | Model
Transfer Learning | MML | Average (%) | 60.1 | Mistral 7B (5-shot)
Question Answering | PeerQA | AlignScore | 0.0827 | Mistral-v02-7B-32k
Question Answering | PeerQA | Prometheus-2 Answer Correctness | 3.4245 | Mistral-v02-7B-32k
Question Answering | PeerQA | Rouge-L | 0.1922 | Mistral-v02-7B-32k
Question Answering | Natural Questions | EM | 28.8 | Mistral 7B (5-shot)
Question Answering | PIQA | Accuracy | 83 | Mistral 7B (0-shot)
Question Answering | TriviaQA | EM | 69.9 | Mistral 7B (5-shot)
Question Answering | NExT-QA | Accuracy | 51.1 | Mistral (7B)
Question Answering | NExT-GQA | Acc@GQA | 9.2 | Mistral (7B)
Question Answering | IntentQA | Accuracy | 50.4 | Mistral (7B)
Question Answering | MATH | Accuracy | 13.1 | Mistral 7B (maj@4)
Video Question Answering | NExT-QA | Accuracy | 51.1 | Mistral (7B)
Video Question Answering | NExT-GQA | Acc@GQA | 9.2 | Mistral (7B)
Video Question Answering | IntentQA | Accuracy | 50.4 | Mistral (7B)
Code Generation | MBPP | Accuracy | 47.5 | Mistral 7B (3-shot)
Common Sense Reasoning | WinoGrande | Accuracy | 75.3 | Mistral 7B (0-shot)
Common Sense Reasoning | ARC (Challenge) | Accuracy | 55.5 | Mistral 7B (0-shot)
Common Sense Reasoning | ARC (Easy) | Accuracy | 80 | Mistral 7B (0-shot)
Math Word Problem Solving | MATH | Accuracy | 13.1 | Mistral 7B (maj@4)
Mathematical Question Answering | MATH | Accuracy | 13.1 | Mistral 7B (maj@4)
Multi-Task Learning | MML | Average (%) | 60.1 | Mistral 7B (5-shot)
Mathematical Reasoning | MATH | Accuracy | 13.1 | Mistral 7B (maj@4)
Sentence Completion | HellaSwag | Accuracy | 81.3 | Mistral 7B (0-shot)
Arithmetic Reasoning | GSM8K | Accuracy | 52.2 | Mistral 7B (maj@8)
Arithmetic Reasoning | GSM8K | Parameters (Billion) | 7 | Mistral 7B (maj@8)
Answerability Prediction | PeerQA | Macro F1 | 0.4703 | Mistral-IT-v02-7B-32k
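The maj@4 and maj@8 entries in the table denote majority voting: the model is sampled k times per problem and the most frequent final answer is scored. A minimal sketch of that scoring rule, assuming the k completions have already been reduced to their final answers (the function name is our own):

```python
from collections import Counter

def majority_vote(answers):
    """maj@k scoring: given k sampled final answers, return the most
    common one. Illustrative sketch of the maj@4 / maj@8 notation used
    in the results above, not the paper's evaluation harness.
    """
    counts = Counter(answers)
    answer, _ = counts.most_common(1)[0]
    return answer

# e.g. maj@4 over four sampled answers to one MATH problem
print(majority_vote(["13", "12", "13", "13"]))  # prints "13"
```

Majority voting typically lifts accuracy on GSM8K and MATH relative to a single greedy sample, which is why the reported maj@k figures are not directly comparable to 0-shot or greedy numbers.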

Related Papers

- Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
- CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
- From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
- Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
- Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
- City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
- VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
- QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)