# tawkeed-0.8b
tawkeed-0.8b is an Arabic-first language model built by Tawkeed for on-device and edge AI deployment.
Forked from Qwen/Qwen3.5-0.8B and fine-tuned on large-scale Arabic corpora, it is optimized to run natively on Tawkeed devices, delivering fast, private, Arabic-language AI at the edge.
## Highlights
- Arabic-first — trained and rigorously tested on Arabic text across diverse domains
- Edge-optimized — sized and tuned to run efficiently on Tawkeed edge hardware
- Production-ready — validated on Tawkeed's Arabic benchmark suite for real-world accuracy
- Bilingual — retains strong English capability from the base model
## Model Details
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-0.8B |
| Parameters | 0.8B |
| Language | Arabic (ar), English (en) |
| License | Apache 2.0 |
| Fine-tuning | Continued pretraining + SFT on Arabic data |
| Deployment | On-device / Edge / Cloud |
## Training
This model is fine-tuned through a multi-stage Arabic enhancement pipeline:
- Continued pretraining on Arabic corpora — Wikipedia, CulturaX, OSCAR
- Supervised fine-tuning (SFT) on curated Arabic instruction datasets — OALL, Alpaca-GPT4-Arabic, Aya
- Evaluation on Tawkeed's Arabic benchmark suite to ensure quality across generation, comprehension, and reasoning tasks
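As a hedged illustration of the SFT stage above, Alpaca-style instruction records (the format used by datasets such as Alpaca-GPT4-Arabic) are typically converted into chat-format messages before training. The field names below follow the common Alpaca convention and are an assumption for illustration, not a description of Tawkeed's exact pipeline:

```python
# Hypothetical sketch of SFT data preparation: mapping an Alpaca-style
# record (instruction/input/output fields, the common convention) into
# the chat-message format consumed by supervised fine-tuning.
def alpaca_to_messages(record):
    user_text = record["instruction"]
    if record.get("input"):
        # Fold the optional context field into the user turn.
        user_text += "\n\n" + record["input"]
    return [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": record["output"]},
    ]

example = {
    # "Translate the following sentence into English."
    "instruction": "ترجم الجملة التالية إلى الإنجليزية.",
    "input": "السماء صافية اليوم.",
    "output": "The sky is clear today.",
}
messages = alpaca_to_messages(example)
```

Each resulting message list can then be rendered through the model's chat template and tokenized for training.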
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("tawkeed-sa/tawkeed-0.8b", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("tawkeed-sa/tawkeed-0.8b")

# "What is the capital of Saudi Arabia?"
messages = [{"role": "user", "content": "ما هي عاصمة المملكة العربية السعودية؟"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
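For reference, `apply_chat_template` turns the message list into a single prompt string. The authoritative template ships with the tokenizer; Qwen-family models typically use a ChatML-style layout, and the sketch below only illustrates the shape of that string (it is an assumption, not the tokenizer's actual template):

```python
# Hedged sketch of a ChatML-style chat template, as commonly used by
# Qwen-family models. Always prefer tokenizer.apply_chat_template in
# real code; this only shows what the rendered prompt roughly looks like.
def render_chatml(messages):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Equivalent of add_generation_prompt=True: open the assistant turn.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml([{"role": "user", "content": "مرحبا"}])  # "Hello"
print(prompt)
```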
## Tawkeed Model Family
A complete suite of Arabic AI models — from compact edge models to large-scale MoE — all fine-tuned and tested for Arabic.
| Model | Size | Type |
|---|---|---|
| tawkeed-sa/tawkeed-0.8b | 0.8B | Arabic LLM |
| tawkeed-sa/tawkeed-2b | 2B | Arabic LLM |
| tawkeed-sa/tawkeed-4b | 4B | Arabic LLM |
| tawkeed-sa/tawkeed-9b | 9B | Arabic LLM |
| tawkeed-sa/tawkeed-27b | 27B | Arabic LLM |
| tawkeed-sa/tawkeed-40b | 40B | Arabic LLM |
| tawkeed-sa/tawkeed-27b-MLX | 27B 8-bit | LLM — Apple Silicon (MLX) |
| tawkeed-sa/tawkeed-27b-GGUF | 27B Q8_0 | LLM — Ollama / llama.cpp |
| tawkeed-sa/tawkeed-ocr | — | OCR |
| tawkeed-sa/tawkeed-embedding | — | Embedding |
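The GGUF build in the table can be run locally with Ollama or llama.cpp. The commands below assume Ollama's support for pulling GGUF repos directly from the Hugging Face Hub (`hf.co/<repo>`) and a guessed quant filename; treat them as a sketch, not a tested deployment recipe:

```shell
# Pull and run the Q8_0 GGUF build directly from the Hub with Ollama
# (assumes the repo is Ollama-compatible).
ollama run hf.co/tawkeed-sa/tawkeed-27b-GGUF

# Or with llama.cpp after downloading the .gguf file
# (filename is an assumption; check the repo for the actual name):
# llama-cli -m tawkeed-27b-Q8_0.gguf -p "مرحبا"
```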
## About Tawkeed
Tawkeed builds Arabic-native AI that runs on the edge. Every model in the family is fine-tuned for Arabic, tested on Arabic benchmarks, and optimized for deployment on Tawkeed devices.
Built by Tawkeed.