mistralai/Mistral-Small-24B-Instruct-2501

patrickvonplaten committed on Jan 30

Commit 8dc72ef · verified · 1 Parent(s): 3727ce4

Update README.md

Files changed (1)

README.md +33 -0

@@ -45,6 +45,39 @@ Learn more about Mistral Small in our [blog post](https://mistral.ai/news/mistra
 - **System Prompt:** Maintains strong adherence and support for system prompts.
 - **Tokenizer:** Utilizes a Tekken tokenizer with a 131k vocabulary size.
+## Benchmark results
+### Human evaluated benchmarks
+TODO:
+### Publicly accessible benchmarks
+**Reasoning & Knowledge**
+| Evaluation | mistral-small-24B-instruct-2501 | gemma-2b-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 |
+|------------|---------------|--------------|---------------|---------------|-------------|
+| mmlu_pro_5shot_cot_instruct | 0.663 | 0.536 | 0.666 | 0.683 | 0.617 |
+| gpqa_main_cot_5shot_instruct | 0.453 | 0.344 | 0.531 | 0.404 | 0.377 |
+**Math & Coding**
+| Evaluation | mistral-small-24B-instruct-2501 | gemma-2b-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 |
+|------------|---------------|--------------|---------------|---------------|-------------|
+| humaneval_instruct_pass@1 | 0.848 | 0.732 | 0.854 | 0.909 | 0.890 |
+| math_instruct | 0.706 | 0.535 | 0.743 | 0.819 | 0.761 |
+| aime_instruct_maj@16 | 0.133 | 0.067 | 0.2333 | 0.100 | 0.100 |
+**Instruction following**
+| Evaluation | mistral-small-24B-instruct-2501 | gemma-2b-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 |
+|------------|---------------|--------------|---------------|---------------|-------------|
+| mtbench_dev | 8.35 | 7.86 | 7.96 | 8.26 | 8.33 |
+| wildbench | 52.27 | 48.21 | 50.04 | 52.73 | 56.13 |
+| arena_hard | 0.873 | 0.788 | 0.840 | 0.860 | 0.897 |
+| ifeval | 0.829 | 0.8065 | 0.8835 | 0.8401 | 0.8499 |
 ### Basic Instruct Template (V7-Tekken)
 ```