Model Card for Tamil-R1

Tamil-R1 is the first reasoning model fine-tuned to think natively in Tamil using chain-of-thought reasoning. Built on DeepSeek-R1-Distill-Llama-8B with a LoRA adapter, it is a proof of concept for regional-language reasoning, trained at zero cost.

Model Details

Model Description

Tamil-R1 performs step-by-step reasoning in Tamil for everyday decision-making scenarios. Unlike translation-based models, it reasons directly in Tamil and shows its reasoning inside visible <think> tags. Version 0.1 was trained on 100 hand-crafted examples as a proof of concept.

Developed by: Sushil Narayanan (Independent Research)
Funded by: Self-funded (₹0 cost using free resources)
Shared by: Sushil Narayanan
Model type: Causal Language Model with LoRA adapter
Language(s) (NLP): Tamil (ta)
License: MIT
Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B

Example Prompts

1. Shopping Decision (Price vs Distance)

Question:

காய்கறி வாங்கணும். பக்கத்து கடை ₹45, தூர கடை ₹35. எது நல்லது?

Tamil-R1 Thinking (in Tamil):

<think>
இரண்டு வழி பார்ப்போம்:

பக்கத்து கடை:
- விலை = ₹45
- நேரம் = 5 நிமிடம்
- வசதியானது

தூர கடை:
- விலை = ₹35
- நேரம் = 30 நிமிடம்
- கஷ்டம்

நேர மதிப்பு கணக்கு:
- ₹10 சேமிப்புக்கு 25 நிமிடம் கூடுதல்
- மணிக்கு ₹24 மதிப்பு மட்டும்
- உன் நேரம் இதுக்கு அதிக மதிப்பு

வசதி முக்கியம்: மழை, அவசர நேரம், வேலை களைப்பு
</think>

Answer:

பக்கத்து கடையே போ. ₹10 சேமிப்புக்கு 25 நிமிடம் பயணம் மதிப்பு இல்ல. நேரம் = பணத்தை விட முக்கியம்.

2. Phone Purchase Decision (Warranty Trade-off)

Question:

புது மொபைல் வாங்கணும். ₹15,000 + 1 வருட வாரண்டி VS ₹13,500 + 6 மாத வாரண்டி. எது?

Tamil-R1 Thinking:

<think>
வாரண்டி மதிப்பு:
- மொபைல் பழுது = ₹3,000-₹5,000
- 6-12 மாதத்தில் பழுது வாய்ப்பு அதிகம்
- ₹1,500 கொடுத்து ₹5,000 செலவு தவிர்ப்பது புத்திசாலித்தனம்

நாளொன்றுக்கு செலவு:
- ₹1,500 ÷ 365 = ₹4 நாளொன்றுக்கு
- மனநிம்மதிக்கு ₹4 = மிக குறைவு
- காப்பீடு மாதிரி நினை
</think>

Answer:

₹15,000 எடு. நாளொன்றுக்கு ₹4 பாதுகாப்பு. 6 மாதம் கூடுதல் வாரண்டி = பெரிய செலவு தவிர்ப்பு + மனநிம்மதி.

3. AC vs Fan Decision (Summer Comfort)

Question:

கோடையில் ரொம்ப வெப்பம். ஏசி போட்டா ₹500 கூடுதல் மாதம். விசிறியில் சமாளிக்கலாமா?

Tamil-R1 Thinking:

<think>
தூக்க தரம்:
- விசிறி = 5-6 மணி புரட்டி புரட்டி
- ஏசி = 7-8 மணி ஆழ்ந்த தூக்கம்
- நல்ல தூக்கம் = அடுத்த நாள் கவனம் கூடும்

நாளொன்றுக்கு செலவு:
- ₹500 ÷ 30 = ₹17 நாளொன்றுக்கு
- 4 பேர் குடும்பம் = நபருக்கு ₹4
- வசதிக்கு ₹4 = மிக குறைவு

கோடை 3-4 மாதங்கள் மட்டும்:
- வருடம் = ₹2,000 மட்டும்
- உடல்நல முதலீடு மாதிரி
</think>

Answer:

ஏசி போடு. உடல்நலம் + தூக்க தரம் + வேலை செயல்திறன் = ₹500 மதிப்பு உள்ளது. கோடை 4 மாதம் மட்டும், நிர்வகிக்கலாம்.

4. Cooking vs Ordering Decision (Work-Life Balance)

Question:

8 மணிக்கு வீட்டுக்கு வந்தேன், சோர்வு. சமைக்க 1 மணி நேரம், ₹80. சுவிகியில் ₹200. என்ன பண்றது?

Tamil-R1 Thinking:

<think>
சோர்வு + நேரம்:
- இப்போ சமை = 9:30 மணி இரவு உணவு
- சுவிகி = 8:45 மணி இரவு உணவு
- தாமத உணவு = ஜீரணம் பாதிக்கும்

நேர மதிப்பு:
- 1 மணி சமையல் வீணு
- அந்த நேரத்தில்: ஓய்வு + குடும்ப நேரம் + மன புத்துணர்ச்சி

₹120 மதிப்பு:
- 80 நிமிடங்கள் சேமிப்பு
- நிமிடத்துக்கு ₹1.5 = நியாயம்
- ஒரு நாள் மட்டும், தினசரி இல்ல
</think>

Answer:

இன்று ஆர்டர் செய், குற்ற உணர்வு வேண்டாம். ₹120 = சோர்வான நாளுக்கு மன அமைதி. சுய-பராமரிப்பு அவசியம். நாளை வீட்டில் சமையலாம்.

5. Simple Math with Cultural Context

Question:

பொங்கலுக்கு 50 பேர் வரவாங்க. ஒருத்தருக்கு 2 இட்லி. எத்தனை இட்லி செய்யணும்?

Tamil-R1 Thinking:

<think>
ஆட்கள் = 50 பேர்
ஒருவருக்கு = 2 இட்லி

மொத்தம் = 50 × 2 = 100 இட்லி

புத்திசாலித்தனமாக யோசிப்போம்:
- சிலருக்கு அதிகம் வேண்டும் (3-4 இட்லி)
- குழந்தைகளுக்கு குறைவாக (1 இட்லி)
- விருந்தில் கூடுதலாக வைப்பது நல்லது
- பத்திரமாக 10-15 இட்லி அதிகம் செய்யலாம்

இறுதி எண்ணிக்கை = 110-115 இட்லி
</think>

Answer: 50 × 2 = 100 இட்லி. ஆனா விருந்துன்னா 10-15 கூடுதல் செய்யலாம். மொத்தம் 110-115 இட்லி பாதுகாப்பானது.

Model Sources

Repository: https://huggingface.co/sushilnarayanan/tamil-r1-reasoning-model
Paper: N/A (Independent project, no paper published)
Demo: https://huggingface.co/spaces/sushilnarayanan/tamil-r1-reasoning-demo

Uses

Direct Use

Use for Tamil everyday reasoning: shopping decisions, cooking planning, daily choices, health routines. Load with PEFT library. Not suitable for production (v0.1 with 100 examples only). See code example in "How to Get Started" section below.

Downstream Use

Can be further fine-tuned for specific Tamil reasoning domains (business, education, healthcare) and is suitable as a foundation for Tamil conversational agents. The dataset should be scaled to 500+ examples before any downstream production use.

Out-of-Scope Use

Not for: complex academic problems, critical real-time decisions, legal/medical advice, multilingual tasks, production systems without additional training, knowledge beyond January 2025.

Bias, Risks, and Limitations

Limited to 100 training examples - high risk of repetitive/incomplete outputs. Cultural bias toward Tamil Nadu context. May not generalize to all Tamil-speaking regions. No evaluation on standard benchmarks. Use only for research/experimentation in v0.1.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    torch_dtype=torch.float16,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Llama-8B")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "sushilnarayanan/tamil-r1-reasoning-model")

# Generate response
question = "காய்கறி வாங்கணும், பக்கத்து கடையில ₹45, தூரமா இருக்குற கடையில ₹35. எங்க வாங்கலாம்?"
inputs = tokenizer(f"User: {question}\n\nAssistant:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
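
The decoded string contains the prompt, the reasoning, and the final answer in one piece. A minimal sketch for separating the reasoning from the answer, assuming the output follows the <think>...</think> layout shown in the example prompts; `split_reasoning` is a hypothetical helper, not part of the released code:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer).

    Assumes the reasoning is wrapped in <think>...</think>. If the tags
    are missing or unclosed, the whole text is returned as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing tag is treated as the final answer.
    answer = text[match.end():].strip()
    return reasoning, answer

sample = "<think>\n50 × 2 = 100\n</think>\n\n100 இட்லி"
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

Because v0.1 can produce incomplete outputs, the fallback branch matters: an unclosed <think> tag yields an empty reasoning string rather than an exception.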

Training Details

Training Data

100 hand-crafted Tamil reasoning examples across 6 categories: Shopping/Money, Basic Maths, Home/Cooking, Daily Decisions, Health/Routine, Common Sense. Format: User question → Tamil reasoning → Answer. No external datasets used.

Training Procedure

Preprocessing [optional]

Text formatted as "User: {question}\n\nAssistant: {response}" with <think> tags for reasoning. Tokenized with max length 2048, truncation enabled. No data augmentation applied.
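
As an illustration of this layout, the sketch below builds one training string from a question, a reasoning trace, and an answer. The helper name `format_example` and the exact placement of the <think> block inside the assistant turn are assumptions based on the example prompts in this card, not the released preprocessing code:

```python
def format_example(question: str, reasoning: str, answer: str) -> str:
    """Build one training string in the "User / Assistant" layout.

    Assumption: the assistant response is the reasoning wrapped in
    <think> tags, followed by a blank line and the final answer.
    """
    response = f"<think>\n{reasoning}\n</think>\n\n{answer}"
    return f"User: {question}\n\nAssistant: {response}"

text = format_example(
    question="50 பேருக்கு ஒருத்தருக்கு 2 இட்லி. எத்தனை இட்லி?",
    reasoning="50 × 2 = 100",
    answer="100 இட்லி",
)
print(text)
```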

Training Hyperparameters

Training regime: fp16 mixed precision

LoRA Configuration:

Rank: 16
Alpha: 16
Dropout: 0
Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj

Training Parameters:

Epochs: 3
Batch size: 2 per device
Gradient accumulation steps: 4
Effective batch size: 16
Learning rate: 2e-4
Optimizer: AdamW (8-bit)
Warmup steps: 5
Max sequence length: 2048
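
The effective batch size follows from the per-device batch size, the gradient accumulation steps, and the number of devices. A quick check of the arithmetic, assuming both T4 GPUs take part in data-parallel training (the device count is not listed among the parameters above):

```python
per_device_batch_size = 2
gradient_accumulation_steps = 4
num_gpus = 2  # assumption: both T4 GPUs are used in data-parallel mode

# Samples contributing to each optimizer update.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)
```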

Speeds, Sizes, Times

Training time: 30 minutes on Kaggle Tesla T4 x2. Model size: 195.8 MB (LoRA adapter only). Inference: ~5-10 seconds per response on T4 GPU. 150 total training steps across 3 epochs.

Evaluation

Testing Data, Factors & Metrics

Testing Data

No separate test set. Model evaluated qualitatively on 3 unseen Tamil questions. Results showed proof-of-concept but limited generalization due to small training size.

Factors

Performance varies by question complexity and domain. Best for everyday scenarios similar to training examples. Struggles with: novel situations, multi-step reasoning requiring >5 steps, culturally unfamiliar contexts, technical/specialized topics.

Metrics

Training loss: Epoch 1 (2.7336) → Epoch 2 (2.1412) → Epoch 3 (2.0722). No standard benchmark scores yet. Qualitative evaluation only in v0.1.
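
For readers who prefer perplexity, the reported losses can be converted with exp(loss). This is a rough sketch that assumes the figures are mean token-level cross-entropy in nats (the usual convention for causal language model training). The results are training perplexities, not held-out scores:

```python
import math

epoch_losses = [2.7336, 2.1412, 2.0722]  # training loss per epoch, as reported

# Perplexity is the exponential of the mean cross-entropy (in nats).
perplexities = [math.exp(loss) for loss in epoch_losses]
relative_drop = (epoch_losses[0] - epoch_losses[-1]) / epoch_losses[0]

for epoch, (loss, ppl) in enumerate(zip(epoch_losses, perplexities), start=1):
    print(f"epoch {epoch}: loss={loss:.4f} perplexity={ppl:.2f}")
print(f"relative loss reduction: {relative_drop:.1%}")
```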

Results

Successfully demonstrates native Tamil reasoning with <think> tags. The model produces grammatically correct Tamil but suffers from repetition and incomplete outputs. Proof-of-concept achieved; production readiness requires 5-10x more training data.

Summary

Version 0.1 proves native Tamil reasoning is possible but requires significant scaling for practical applications.

Model Examination

Reasoning traces show native Tamil thinking patterns without English mixing. Some repetitive token generation was observed. The model successfully learned the <think> tag structure. Further interpretability work is needed.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator (https://mlco2.github.io/impact#compute) presented in Lacoste et al. (2019) (https://arxiv.org/abs/1910.09700).

Hardware Type: Kaggle Tesla T4 x2 (14.7 GB VRAM)
Hours used: 0.5 hours (30 minutes training time)
Cloud Provider: Kaggle (Google Cloud Platform backend)
Compute Region: US (Kaggle default region)
Carbon Emitted: ~0.014 kg CO2eq (estimated for 30 min T4 GPU usage)
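
An estimate of this order can be reproduced with a back-of-the-envelope calculation. The sketch below is not necessarily how the figure above was obtained; it assumes a single T4 drawing its 70 W rated power and a grid carbon intensity of 0.4 kg CO2eq per kWh. Both values are illustrative assumptions:

```python
gpu_power_kw = 0.070     # assumption: one T4 at its 70 W rated power
hours = 0.5              # training time reported in this card
carbon_intensity = 0.4   # assumption: kg CO2eq emitted per kWh

energy_kwh = gpu_power_kw * hours
emissions_kg = energy_kwh * carbon_intensity
print(f"{emissions_kg:.3f} kg CO2eq")
```

Under the same assumptions, two GPUs running at full power for the whole session would double the result.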

Technical Specifications [optional]

Model Architecture and Objective

Transformer decoder (Llama architecture) with LoRA adapters. Base: 8B parameters, Trainable: 42M parameters (0.52%). LoRA rank 16, alpha 16, targeting q/k/v/o/gate/up/down projection layers.
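
The trainable-parameter figure can be checked by hand: a LoRA adapter of rank r on a linear layer with shape (in, out) adds r × (in + out) weights. The sketch below assumes the standard Llama-3.1-8B dimensions for the base model (hidden size 4096, MLP size 14336, 32 layers, 8 key/value heads of dimension 128) and a base parameter count of about 8.03B. These values are assumptions taken from the public base-model configuration, not from this card:

```python
# Assumed Llama-3.1-8B dimensions (not stated in this card).
hidden_size = 4096
intermediate_size = 14336
num_layers = 32
kv_dim = 8 * 128  # grouped-query attention: 8 KV heads x 128 head dim
rank = 16
base_params = 8_030_261_248  # assumed total parameter count of the base model

# (in_features, out_features) for each targeted projection.
target_shapes = {
    "q_proj": (hidden_size, hidden_size),
    "k_proj": (hidden_size, kv_dim),
    "v_proj": (hidden_size, kv_dim),
    "o_proj": (hidden_size, hidden_size),
    "gate_proj": (hidden_size, intermediate_size),
    "up_proj": (hidden_size, intermediate_size),
    "down_proj": (intermediate_size, hidden_size),
}

# Each adapter holds A (rank x in) and B (out x rank).
params_per_block = sum(rank * (i + o) for i, o in target_shapes.values())
trainable_params = params_per_block * num_layers
fraction = trainable_params / base_params

print(f"{trainable_params:,} trainable parameters ({fraction:.2%} of base)")
```

Under these assumptions the count comes to roughly 42M parameters, consistent with the figure quoted above.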

Compute Infrastructure

Kaggle Notebooks with Tesla T4 x2 GPU, 30GB RAM, Ubuntu OS. Free tier compute quota.

Hardware

GPU: 2x Tesla T4 (14.7 GB VRAM total)
CPU: 2 vCPU
RAM: 30 GB
Storage: 73 GB disk

Software

Python 3.11, PyTorch 2.9.1, Transformers 4.57.2, PEFT 0.14.0, Unsloth 2025.11.4, Accelerate 1.2.1, Datasets 3.2.0

Citation [optional]

BibTeX:

@misc{sushil2025tamilr1,
  author = {Narayanan, Sushil},
  title = {Tamil-R1: First Reasoning Model That Thinks in Tamil},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/sushilnarayanan/tamil-r1-reasoning-model}
}

APA:

Sushil Narayanan. (2025). Tamil-R1: First Reasoning Model That Thinks in Tamil. HuggingFace. https://huggingface.co/sushilnarayanan/tamil-r1-reasoning-model

Glossary

LoRA: Low-Rank Adaptation - efficient fine-tuning method that trains small adapter layers
Chain-of-Thought (CoT): Step-by-step reasoning process shown in <think> tags
PEFT: Parameter-Efficient Fine-Tuning library for loading adapters
Native reasoning: Thinking directly in Tamil without translation from English

More Information

For questions or collaboration: LinkedIn https://www.linkedin.com/in/sushil-kumar08/

Roadmap: v0.2 (500 examples), v1.0 (1000 examples + mobile app)

Community contributions welcome for dataset expansion.

Model Card Authors

Sushil Narayanan

Model Card Contact

LinkedIn: https://www.linkedin.com/in/sushil-kumar08/
HuggingFace: https://huggingface.co/sushilnarayanan
