---
library_name: transformers
tags:
- gemma3
- instruct
- mamaylm
- insait
license: gemma
language:
- uk
- en
base_model:
- google/gemma-3-12b-it
- google/gemma-3-12b-pt
pipeline_tag: image-text-to-text
datasets:
- Goader/kobza
- HuggingFaceFW/fineweb-2
- HPLT/HPLT2.0_cleaned
- wikimedia/wikipedia
- HuggingFaceTB/smoltalk2
- open-r1/Mixture-of-Thoughts
---

# INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0

![image/png](/static-proxy?url=https%3A%2F%2Fcdn-uploads.huggingface.co%2Fproduction%2Fuploads%2F637e1f8cf7e01589cc17bf7e%2Fp6d0YFHjWCQ3S12jWqO1m.png)

INSAIT introduces **MamayLM-Gemma-3-12B-IT-v1.0**, the best-performing Ukrainian language model based on **google/gemma-3-12b-pt** and **google/gemma-3-12b-it**. MamayLM-Gemma-3-12B-IT-v1.0 is **free to use** and distributed under the [Gemma Terms of Use](https://ai.google.dev/gemma/terms). This model was created by [`INSAIT`](https://insait.ai/), part of Sofia University St. Kliment Ohridski, in Sofia, Bulgaria.

# Model description

The model was built on top of Google’s Gemma 3 12B open models. It was continually pre-trained on a large pre-filtered dataset using a combination of data mixing and model merging, allowing the model to gain outstanding Ukrainian cultural and linguistic capabilities while retaining its English performance. During the pre-training stage, we used various datasets, including Ukrainian web crawl data (Kobza), freely available datasets such as Wikipedia, a range of specialized Ukrainian datasets, and machine translations of popular English datasets. The model was then instruction-fine-tuned on a newly constructed Ukrainian instruction dataset, created from machine translations of the current best English datasets and from specialized Ukrainian datasets prepared by the Ukrainian community. For more information, see our [blogpost](http://blog.mamaylm.insait.ai) (available in English and Ukrainian).
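The description above does not say which merging method was used. As a purely illustrative sketch, the snippet below assumes the simplest variant, linear interpolation of two checkpoints that share parameter names and shapes. The function `merge_linear`, the weight `alpha`, and the toy checkpoints are hypothetical and are not taken from the MamayLM training code.

```python
from typing import Dict, List

Checkpoint = Dict[str, List[float]]


def merge_linear(a: Checkpoint, b: Checkpoint, alpha: float = 0.5) -> Checkpoint:
    """Return alpha * a + (1 - alpha) * b, parameter by parameter.

    Assumption: both checkpoints have identical parameter names and sizes.
    """
    if a.keys() != b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged: Checkpoint = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            alpha * x + (1.0 - alpha) * y for x, y in zip(a[name], b[name])
        ]
    return merged


# Toy "checkpoints": flat lists stand in for real weight tensors.
ckpt_a = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
ckpt_b = {"layer.weight": [2.0, 4.0], "layer.bias": [3.0]}

merged = merge_linear(ckpt_a, ckpt_b, alpha=0.5)
print(merged)
```

With real models the same idea would be applied to the tensors in each model's state dict; other merging schemes exist and may behave differently.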

# Benchmarks and Results

![image/png](/static-proxy?url=https%3A%2F%2Fcdn-uploads.huggingface.co%2Fproduction%2Fuploads%2F650ed7adf141bc34f91a12ae%2FviINoBT15cG5AxU5xFPgz.png)

![image/png](/static-proxy?url=https%3A%2F%2Fcdn-uploads.huggingface.co%2Fproduction%2Fuploads%2F650ed7adf141bc34f91a12ae%2FGvCFFl2NQVxxnRa3pjD2V.png)

We evaluate our models on a set of standard English benchmarks, translated versions of them in Ukrainian, and Ukrainian-specific benchmarks we collected:

- **Winogrande challenge**: testing commonsense reasoning and understanding
- **Hellaswag**: testing sentence completion
- **ARC Easy/Challenge**: testing reasoning on grade-school science questions
- **TriviaQA**: testing trivia knowledge
- **GSM-8k**: solving grade-school math word problems
- **MMLU**: testing knowledge on a multitude of topics
- **IFEval**: testing instruction-following skills
- **ZNO**: testing knowledge of the Ukrainian high school curriculum in Ukrainian language & literature, history, mathematics and geography

These benchmarks test logical reasoning, mathematics, knowledge, language understanding and other skills of the models and are provided at https://github.com/insait-institute/lm-evaluation-harness-uk. The graphs above show the performance of MamayLM 12B compared to other large open models. The results show the excellent abilities of MamayLM in Ukrainian, which allow it to **outperform much larger models**, including Alibaba’s Qwen 2.5 72B and Meta’s Llama3.1 70B. Finally, our models retain the **excellent English performance** inherited from the original Google Gemma 3 models upon which they are based.

![image/png](/static-proxy?url=https%3A%2F%2Fcdn-uploads.huggingface.co%2Fproduction%2Fuploads%2F650ed7adf141bc34f91a12ae%2FZc6vtA12ohuX5_S8ETN8Q.png)

MamayLM v1.0 12B also shows improved performance on visual benchmarks such as MMMU and ZNO-Vision (MMZNO):

![image/png](/static-proxy?url=https%3A%2F%2Fcdn-uploads.huggingface.co%2Fproduction%2Fuploads%2F650ed7adf141bc34f91a12ae%2FW0MQUv6OSnEDMCVAD7kLy.png)

![image/png](/static-proxy?url=https%3A%2F%2Fcdn-uploads.huggingface.co%2Fproduction%2Fuploads%2F650ed7adf141bc34f91a12ae%2FweS08Z8wdbb3mkm3pB75z.png)

# Use in 🤗 Transformers
First install the latest version of the transformers library:
```
pip install -U 'transformers[torch]'
```
Then load the model in transformers:
```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0",
    torch_dtype=torch.bfloat16,
    # Requires the separate flash-attn package to be installed.
    attn_implementation="flash_attention_2",
    device_map="auto",
)
```
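The loading call above requests `flash_attention_2`, which depends on the separately installed `flash-attn` package. A minimal sketch of a fallback, assuming `"sdpa"` is an acceptable alternative for inference on your setup: the helper `pick_attn_implementation` is hypothetical, and it only checks that the package is importable, not that your GPU supports it.

```python
import importlib.util


def pick_attn_implementation() -> str:
    """Return "flash_attention_2" if flash_attn is importable, else "sdpa"."""
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    # Assumption: PyTorch's scaled-dot-product attention is a reasonable fallback.
    return "sdpa"


attn_implementation = pick_attn_implementation()
print(attn_implementation)
```

The returned string can be passed as the `attn_implementation` argument of `from_pretrained`.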

# Recommended Parameters

For optimal performance, we recommend the following parameters for text generation, as we have extensively tested our model with them:

```python
from transformers import GenerationConfig
generation_params = GenerationConfig(
    max_new_tokens=2048,              # Choose maximum generation tokens
    temperature=0.1,
    top_k=25,
    top_p=1,
    repetition_penalty=1.1,
    # eos_token_id=[1,106],
    do_sample=True
)
```

In principle, increasing temperature should work adequately as well.
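To give a feel for what these settings do, here is a small self-contained sketch of temperature scaling followed by top-k filtering on a toy logit vector. It is a simplified illustration, not the Transformers implementation: it ignores `repetition_penalty`, and `top_p=1` is treated as a no-op. The function name `sampling_distribution` and the toy logits are made up for this example.

```python
import math
from typing import Dict, List


def sampling_distribution(
    logits: List[float], temperature: float = 0.1, top_k: int = 25
) -> Dict[int, float]:
    """Return token index -> probability after temperature and top-k filtering."""
    # Keep only the indices of the top_k largest logits.
    keep = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Dividing by a temperature below 1 sharpens the distribution.
    scaled = {i: logits[i] / temperature for i in keep}
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled.values())
    exps = {i: math.exp(v - peak) for i, v in scaled.items()}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


toy_logits = [2.0, 1.5, 0.5, -1.0]
sharp = sampling_distribution(toy_logits, temperature=0.1, top_k=25)
flat = sampling_distribution(toy_logits, temperature=1.0, top_k=25)
print(f"P(top token) at T=0.1: {sharp[0]:.3f}")
print(f"P(top token) at T=1.0: {flat[0]:.3f}")
```

At `temperature=0.1` almost all probability mass lands on the highest-scoring token, so sampling behaves close to greedy decoding; raising the temperature spreads the mass across more candidates.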

# Instruction format

In order to leverage instruction fine-tuning, your prompt should begin with a beginning-of-sequence token `<bos>` and be formatted in the Gemma 3 chat template. `<bos>` should only be the first token in a chat sequence.

E.g.
```
<bos><start_of_turn>user
Хто такий Козак Мамай?<end_of_turn>
<start_of_turn>model
 
```
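For illustration, the layout shown above can be assembled by hand. This is a minimal sketch that handles only plain-text `user` and `assistant` turns; the helper name `build_gemma3_prompt` is hypothetical, and the official chat template covers more cases (for example system messages and image content), so prefer it in practice.

```python
from typing import Dict, List


def build_gemma3_prompt(
    messages: List[Dict[str, str]], add_generation_prompt: bool = True
) -> str:
    """Format plain-text chat messages in the Gemma 3 turn layout."""
    prompt = "<bos>"  # <bos> appears exactly once, at the very start
    for message in messages:
        if message["role"] not in ("user", "assistant"):
            raise ValueError(f"unsupported role in this sketch: {message['role']!r}")
        # Gemma names the assistant role "model".
        role = "model" if message["role"] == "assistant" else "user"
        prompt += f"<start_of_turn>{role}\n{message['content']}<end_of_turn>\n"
    if add_generation_prompt:
        # Open a model turn so that generation continues from here.
        prompt += "<start_of_turn>model\n"
    return prompt


example_prompt = build_gemma3_prompt(
    [{"role": "user", "content": "Хто такий Козак Мамай?"}]
)
print(example_prompt)
```

If you tokenize a string built this way, disable automatic special tokens so that `<bos>` is not added a second time.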

This format is also available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating) via the `apply_chat_template()` method:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0",
    use_default_system_prompt=False,
)
messages = [
    {"role": "user", "content": "Хто такий Козак Мамай?"},
]
# With return_dict=True this returns input_ids and attention_mask together.
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
    return_dict=True,
).to(model.device)  # move the inputs to the model's device
outputs = model.generate(
    **inputs,
    generation_config=generation_params,
)
print(tokenizer.decode(outputs[0]))
```
```
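The `decode` call above prints the prompt together with the completion. A minimal sketch of how to keep only the newly generated tokens, assuming the stop ids `1` and `106` that appear in the generation settings of this card; the helper `extract_completion` and the toy token ids are hypothetical.

```python
from typing import List, Sequence


def extract_completion(
    output_ids: Sequence[int], prompt_length: int, stop_ids: Sequence[int] = (1, 106)
) -> List[int]:
    """Drop the prompt tokens and cut the completion at the first stop id."""
    completion: List[int] = []
    for token_id in output_ids[prompt_length:]:
        if token_id in stop_ids:
            break
        completion.append(token_id)
    return completion


# Toy ids: the first four stand in for the prompt, the rest for generated tokens.
toy_output = [2, 105, 2364, 107, 9001, 9002, 106, 1]
print(extract_completion(toy_output, prompt_length=4))
```

With a real generation, pass the generated sequence as a list of ids together with the number of prompt tokens, then decode the returned ids with the tokenizer.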

# Use with vLLM

Example usage with vLLM:

```python
from vllm import LLM, SamplingParams
from vllm.inputs import TokensPrompt
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(
    "INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0",
    use_default_system_prompt=False,
)
sampling_params = SamplingParams(
    max_tokens=2048,
    temperature=0.1,
    top_k=25,
    top_p=1,
    repetition_penalty=1.1,
    stop_token_ids=[1, 106],
)
llm = LLM(
    model="INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0",
    dtype="bfloat16",
    # enforce_eager=True
)
messages = [
    {"role": "user", "content": "Хто такий Козак Мамай?"},
]
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
input_ids = tokenizer(
    formatted_prompt,
    add_special_tokens=False
).input_ids
prompt = TokensPrompt(prompt_token_ids=input_ids)
output = llm.generate(
    prompt,
    sampling_params
)
generated_text = output[0].outputs[0].text
print(generated_text)
```

# Use with GGML / llama.cpp

The model and instructions for usage in GGUF format are available at [INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0-GGUF](https://huggingface.co/INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0-GGUF).

# Community Feedback

We welcome feedback from the community to help improve MamayLM. If you have suggestions, encounter any issues, or have ideas for improvements, please:
- Share your experience using the model through Hugging Face's community discussion feature or
- Contact us at [contact@insait.ai](mailto:contact@insait.ai)

Your real-world usage and insights are valuable in helping us optimize the model's performance and behaviour for various use cases.

# Summary
- **Finetuned from:** [google/gemma-3-12b-it](https://huggingface.co/google/gemma-3-12b-it) and [google/gemma-3-12b-pt](https://huggingface.co/google/gemma-3-12b-pt)
- **Model type:** Causal decoder-only transformer language model
- **Language:** Ukrainian and English
- **Contact:** [contact@insait.ai](mailto:contact@insait.ai)
- **License:** MamayLM is distributed under the [Gemma Terms of Use](https://huggingface.co/INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0/raw/main/LICENSE)