# tawkeed-0.8b
tawkeed-0.8b is an Arabic-first language model built by Tawkeed for on-device and edge AI deployment.
Forked from Qwen/Qwen3.5-0.8B and fine-tuned on large-scale Arabic corpora, it is optimized to run natively on Tawkeed devices, delivering fast, private, Arabic-language AI at the edge.
## Highlights
- Arabic-first — trained and rigorously tested on Arabic text across diverse domains
- Edge-optimized — sized and tuned to run efficiently on Tawkeed edge hardware
- Production-ready — validated on Tawkeed's Arabic benchmark suite for real-world accuracy
- Bilingual — retains strong English capability from the base model
## Model Details
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-0.8B |
| Parameters | 0.8B |
| Language | Arabic (ar), English (en) |
| License | Apache 2.0 |
| Fine-tuning | Continued pretraining + SFT on Arabic data |
| Deployment | On-device / Edge / Cloud |
## Training
This model is fine-tuned through a multi-stage Arabic enhancement pipeline:
- Continued pretraining on Arabic corpora — Wikipedia, CulturaX, OSCAR
- Supervised fine-tuning (SFT) on curated Arabic instruction datasets — OALL, Alpaca-GPT4-Arabic, Aya
- Evaluation on Tawkeed's Arabic benchmark suite to ensure quality across generation, comprehension, and reasoning tasks
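As a hedged illustration of the SFT stage above, Alpaca-style instruction records (the format used by datasets such as Alpaca-GPT4-Arabic) are typically converted into chat-format messages before training. The field names below follow the common Alpaca convention and are an assumption for illustration, not a description of Tawkeed's exact pipeline:

```python
# Hypothetical sketch of SFT data preparation: mapping an Alpaca-style
# record (instruction/input/output fields, the common convention) into
# the chat-message format consumed by supervised fine-tuning.
def alpaca_to_messages(record):
    user_text = record["instruction"]
    if record.get("input"):
        # Fold the optional context field into the user turn.
        user_text += "\n\n" + record["input"]
    return [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": record["output"]},
    ]

example = {
    # "Translate the following sentence into English."
    "instruction": "ترجم الجملة التالية إلى الإنجليزية.",
    "input": "السماء صافية اليوم.",
    "output": "The sky is clear today.",
}
messages = alpaca_to_messages(example)
```

Each resulting message list can then be rendered through the model's chat template and tokenized for training.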
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("tawkeed-sa/tawkeed-0.8b", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("tawkeed-sa/tawkeed-0.8b")

# "What is the capital of Saudi Arabia?"
messages = [{"role": "user", "content": "ما هي عاصمة المملكة العربية السعودية؟"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
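For reference, `apply_chat_template` turns the message list into a single prompt string. The authoritative template ships with the tokenizer; Qwen-family models typically use a ChatML-style layout, and the sketch below only illustrates the shape of that string (it is an assumption, not the tokenizer's actual template):

```python
# Hedged sketch of a ChatML-style chat template, as commonly used by
# Qwen-family models. Always prefer tokenizer.apply_chat_template in
# real code; this only shows what the rendered prompt roughly looks like.
def render_chatml(messages):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Equivalent of add_generation_prompt=True: open the assistant turn.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml([{"role": "user", "content": "مرحبا"}])  # "Hello"
print(prompt)
```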
## Tawkeed Model Family
A complete suite of Arabic AI models — from compact edge models to large-scale MoE — all fine-tuned and tested for Arabic.
| Model | Size | Type |
|---|---|---|
| tawkeed-sa/tawkeed-0.8b | 0.8B | Arabic LLM |
| tawkeed-sa/tawkeed-2b | 2B | Arabic LLM |
| tawkeed-sa/tawkeed-4b | 4B | Arabic LLM |
| tawkeed-sa/tawkeed-9b | 9B | Arabic LLM |
| tawkeed-sa/tawkeed-27b | 27B | Arabic LLM |
| tawkeed-sa/tawkeed-40b | 40B | Arabic LLM |
| tawkeed-sa/tawkeed-27b-MLX | 27B 8-bit | LLM — Apple Silicon (MLX) |
| tawkeed-sa/tawkeed-27b-GGUF | 27B Q8_0 | LLM — Ollama / llama.cpp |
| tawkeed-sa/tawkeed-ocr | — | OCR |
| tawkeed-sa/tawkeed-embedding | — | Embedding |
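The GGUF build in the table can be run locally with Ollama or llama.cpp. The commands below assume Ollama's support for pulling GGUF repos directly from the Hugging Face Hub (`hf.co/<repo>`) and a guessed quant filename; treat them as a sketch, not a tested deployment recipe:

```shell
# Pull and run the Q8_0 GGUF build directly from the Hub with Ollama
# (assumes the repo is Ollama-compatible).
ollama run hf.co/tawkeed-sa/tawkeed-27b-GGUF

# Or with llama.cpp after downloading the .gguf file
# (filename is an assumption; check the repo for the actual name):
# llama-cli -m tawkeed-27b-Q8_0.gguf -p "مرحبا"
```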
## About Tawkeed
Tawkeed builds Arabic-native AI that runs on the edge. Every model in the family is fine-tuned for Arabic, tested on Arabic benchmarks, and optimized for deployment on Tawkeed devices.
Built by Tawkeed.