The Ahma-Gemma-3-4B-Instruct-v1.0 model was evaluated primarily with the Finnish MT-Bench by LumiOpen.
Single-turn results:
| Benchmark | Ahma 3B base (instruct prompt format) | Ahma 7B Instruct (instruct prompt format) | Ahma-Gemma-3-4B-Instruct-v1.0 |
|---|---|---|---|
| Coding | 1.00 | 1.00 | 4.20 |
| Extraction | 1.30 | 3.00 | 7.30 |
| Humanities | 6.20 | 8.00 | 8.90 |
| Math | 3.20 | 2.90 | 6.10 |
| Reasoning | 4.60 | 5.70 | 4.80 |
| Roleplay | 6.50 | 7.20 | 7.70 |
| STEM | 5.95 | 7.30 | 9.90 |
| Writing | 9.00 | 8.80 | 9.20 |
| Overall Average | 4.72 | 5.50 | 7.26 |
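As a sanity check on the table above, the overall average appears to be the unweighted mean of the eight category scores. A minimal sketch, assuming plain-mean aggregation (the exact aggregation used by MT-Bench scoring is not stated here), which reproduces the 7.26 reported for Ahma-Gemma-3-4B-Instruct-v1.0:

```python
# Recompute the "Overall Average" for the Ahma-Gemma-3-4B-Instruct-v1.0
# single-turn column, assuming it is the plain mean of the eight categories.
scores = {
    "Coding": 4.20, "Extraction": 7.30, "Humanities": 8.90, "Math": 6.10,
    "Reasoning": 4.80, "Roleplay": 7.70, "STEM": 9.90, "Writing": 9.20,
}
overall = sum(scores.values()) / len(scores)
print(f"{overall:.2f}")  # ≈ 7.26, matching the table
```

Note that small discrepancies in other columns (e.g. the 7B column) can arise from the per-category scores themselves being rounded before averaging.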
Multi-turn results:
| Benchmark | Ahma 3B Instruct (instruct prompt format) | Ahma 7B Instruct (instruct prompt format) | Ahma-Gemma-3-4B-Instruct-v1.0 | Poro 34B Chat | Poro-2-8B-Instruct |
|---|---|---|---|---|---|
| Coding | 1.00 | 1.05 | 4.35 | 3.70 | – |
| Extraction | 1.15 | 2.65 | 6.55 | 6.37 | – |
| Humanities | 6.20 | 7.85 | 6.55 | 9.25 | – |
| Math | 2.70 | 2.40 | 4.80 | 1.20 | – |
| Reasoning | 3.50 | 4.50 | 4.40 | 4.35 | – |
| Roleplay | 6.40 | 6.60 | 7.26 | 7.35 | – |
| STEM | 4.78 | 5.40 | 8.80 | 7.80 | – |
| Writing | 6.65 | 6.25 | 7.60 | 8.50 | – |
| Overall Average | 4.05 | 4.59 | 6.57 | 6.06 | 6.75 |
As the results show, the Ahma-Gemma-3-4B-Instruct-v1.0 model improves upon our previous model generation. We have already started working on the datasets and methods needed to improve this model further and to scale to larger models.
This project would not have been possible without compute generously provided by Google through the TPU Research Cloud.
Thanks also to DataCrunch/Verda for sponsoring compute for fine-tuning: HF org (https://huggingface.co/datacrunch), website (https://verda.com/).
Feel free to contact us for more details 🤗