AyurParam
BharatGen introduces AyurParam, a domain-specialized large language model fine-tuned from Param-1-2.9B-Instruct on a high-quality Ayurveda dataset. It is designed to handle Ayurvedic queries, classical text interpretation, clinical guidance, and wellness knowledge. Ayurveda offers vast traditional medical wisdom, yet most language models lack domain-specific understanding. AyurParam bridges this gap by combining Param-1’s bilingual strengths with a curated Ayurvedic knowledge base, enabling contextually rich and culturally grounded responses.
🏗 Model Architecture
AyurParam inherits the architecture of Param-1-2.9B-Instruct:
- Hidden size: 2048
- Intermediate size: 7168
- Attention heads: 16
- Hidden layers: 32
- Key-value heads: 8
- Max position embeddings: 2048
- Activation: SiLU
- Positional Embeddings: Rotary (RoPE, theta=10000)
- Attention Mechanism: Grouped-query attention
- Precision: bf16-mixed
- Base model: Param-1-2.9B-Instruct
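To make the grouped-query attention settings above concrete, the following minimal sketch maps query heads to key-value heads and estimates the KV-cache size at full context. The per-head dimension of 128 is an assumption (a hidden size of 2048 divided by 16 attention heads), as are the bf16 cache at 2 bytes per value and the contiguous head grouping; the helper names are hypothetical and not part of the model's code.

```python
# Values taken from the architecture list above.
NUM_ATTENTION_HEADS = 16
NUM_KV_HEADS = 8
NUM_LAYERS = 32
MAX_POSITIONS = 2048

# Assumptions, not stated in the list: per-head dimension and cache dtype width.
HEAD_DIM = 128        # assumed: hidden size 2048 / 16 heads
BYTES_PER_VALUE = 2   # assumed: cache stored in bf16


def kv_head_for_query_head(query_head: int) -> int:
    """Return the KV head shared by a query head (contiguous grouping assumed)."""
    group_size = NUM_ATTENTION_HEADS // NUM_KV_HEADS  # query heads per KV head
    return query_head // group_size


def kv_cache_bytes(num_kv_heads: int, seq_len: int = MAX_POSITIONS) -> int:
    """Bytes needed to cache keys and values for one sequence across all layers."""
    return 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM * seq_len * BYTES_PER_VALUE


if __name__ == "__main__":
    print([kv_head_for_query_head(h) for h in range(NUM_ATTENTION_HEADS)])
    gqa = kv_cache_bytes(NUM_KV_HEADS)
    mha = kv_cache_bytes(NUM_ATTENTION_HEADS)
    print(f"GQA cache: {gqa / 2**20:.0f} MiB, full multi-head cache: {mha / 2**20:.0f} MiB")
```

Under these assumptions, each pair of query heads shares one KV head, and the cache for a 2048-token sequence is 256 MiB instead of the 512 MiB that 16 independent KV heads would need.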
📚 AyurParam Dataset Preparation
AyurParam’s dataset was meticulously curated to capture the depth of Ayurvedic wisdom, ensure bilingual accessibility (English + Hindi), and support diverse clinical and academic applications. The preparation process focused on authenticity, quality, and relevance.
🔎 Data Sources
Total Books Collected: ~1000
- ~0.15M Pages, ~54.5M words
- 600 from open-source archives (digitized classical texts)
- 400 from internet sources covering specialized Ayurvedic domains
Domains Covered (examples):
- Kaaychikitsa (कायचिकित्सा)
- Panchkarma (पंचकर्म)
- Shalya Tantra (शल्यतंत्र)
- Shalakya Tantra (शालाक्यतंत्र)
- Research Methodology
- Ashtang Hruday (अष्टांगहृदय)
- Kriya Shaarir (क्रिया शारीर)
- Padarth Vigyan (पदार्थ विज्ञान)
- Rachana Shaarir (रचना शारीर)
- Charak Samhita (चरक संहिता)
- Dravyaguna (द्रव्यगुण)
- Rasa Shastra & Bhaishajya Kalpana (रसशास्त्र एवम भैषज्यकल्पना)
- Rog Nidan (रोगनिदान)
- AgadTantra (अगदतंत्र)
- Balrog (बालरोग)
- Strirog & Prasuti Tantra (स्त्रीरोग एवम प्रसूति तंत्र)
- Swasthvrutta (स्वस्थवृत्त)
- Sanskrit grammar, commentaries, and supporting texts
- etc
🧩 Data Processing Pipeline
1. Source Gathering
- Collected and digitized ~1000 Ayurvedic books across classical, clinical, and academic domains.
- Preserved Sanskrit terminology with transliteration and contextual explanation.
2. Question–Answer Generation
- Method: By-page Q&A generation using an open-source LLM.
- Focus: Only Ayurveda-related, context-grounded questions.
- Review: Domain expert validation for accuracy and clarity.
3. Taxonomy
- Dosha, Dhatu, Mala, Srotas, Nidana, Chikitsa, etc.
4. Final Dataset Construction
- Q&A Types:
  - General Q&A – direct knowledge-based
  - Thinking Q&A – reasoning and application-oriented
  - Objective Q&A – fact-check, MCQ, structured answers
- Languages: English + Hindi
- Training Samples: ~4.8 Million (all combined)
- Includes single-turn and multi-turn conversations
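The pipeline steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `generate_qa` stands in for the open-source LLM call (the model and prompt are not specified here), and the `QASample` fields, the keyword-based `is_ayurveda_related` filter, and the `expert_approved` flag are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical keyword list standing in for a real relevance check.
TAXONOMY_TERMS = ("dosha", "dhatu", "mala", "srotas", "nidana", "chikitsa")


@dataclass
class QASample:
    question: str
    answer: str
    qa_type: str       # "general", "thinking", or "objective"
    language: str      # "en" or "hi"
    source_page: int
    expert_approved: bool = False  # would be set during domain expert review


def is_ayurveda_related(sample: QASample) -> bool:
    """Crude stand-in filter: keep samples that mention a taxonomy term."""
    text = f"{sample.question} {sample.answer}".lower()
    return any(term in text for term in TAXONOMY_TERMS)


def build_dataset(
    pages: Iterable[str],
    generate_qa: Callable[[str, int], List[QASample]],
) -> List[QASample]:
    """Generate Q&A page by page and keep only Ayurveda-related samples."""
    dataset: List[QASample] = []
    for page_number, page_text in enumerate(pages, start=1):
        for sample in generate_qa(page_text, page_number):
            if is_ayurveda_related(sample):
                dataset.append(sample)
    return dataset


def fake_generate_qa(page_text: str, page_number: int) -> List[QASample]:
    """Placeholder for the LLM call; returns one fixed sample per page."""
    return [QASample(
        question=f"What does page {page_number} say?",
        answer=page_text,
        qa_type="general",
        language="en",
        source_page=page_number,
    )]


if __name__ == "__main__":
    pages = ["Vata is one dosha.", "A note on printing history."]
    for sample in build_dataset(pages, fake_generate_qa):
        print(sample.source_page, sample.question)
```

Keeping the page number on each sample is one way to let a reviewer trace an answer back to its source text during expert validation.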
🏋️ Training Setup
- Base model: Param-1-2.9B-Instruct
- Training framework: Hugging Face + TRL (SFT) + torchrun multi-node setup
- Prompt template: Custom-designed for Ayurvedic inference
- Scheduler: Linear with warmup
- Epochs: 3
- Total training samples: ~4.8M
- Test samples: ~800k
- Base learning rate: 5e-6
- Minimum learning rate: 0
- Additional tokens: <user>, <assistant>, <context>, <system_prompt>, <actual_response>, </actual_response>
- Vocab size: 256k + 4
- Global batch size: 1024
- Micro batch size: 4
- Gradient accumulation steps: 32
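The batch-size and schedule settings above imply some simple arithmetic, sketched below. The data-parallel worker count is inferred from global batch = micro batch × gradient accumulation steps × workers rather than stated in the setup, and the warmup length (`WARMUP_STEPS`) is a hypothetical value chosen only for illustration.

```python
import math

# Values from the training setup above.
GLOBAL_BATCH_SIZE = 1024
MICRO_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 32
TRAIN_SAMPLES = 4_800_000  # approximate (~4.8M)
EPOCHS = 3
BASE_LR = 5e-6
MIN_LR = 0.0

# Hypothetical: the warmup length is not given in the setup.
WARMUP_STEPS = 500

# Data-parallel workers implied by the three batch settings.
data_parallel_workers = GLOBAL_BATCH_SIZE // (MICRO_BATCH_SIZE * GRAD_ACCUM_STEPS)

# Optimizer steps, assuming the last partial batch of each epoch is kept.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / GLOBAL_BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def linear_warmup_lr(step: int) -> float:
    """Linear warmup to BASE_LR, then linear decay to MIN_LR at total_steps."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(total_steps - step, 0)
    decay_span = total_steps - WARMUP_STEPS
    return MIN_LR + (BASE_LR - MIN_LR) * remaining / decay_span


if __name__ == "__main__":
    print(data_parallel_workers, steps_per_epoch, total_steps)
    for step in (0, WARMUP_STEPS, total_steps // 2, total_steps):
        print(step, f"{linear_warmup_lr(step):.2e}")
```

With these numbers, the settings are consistent with 8 data-parallel workers and roughly 14k optimizer steps over 3 epochs.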
🚀 Inference Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "bharatgenai/AyurParam"

# Load the tokenizer and the model (the model load allows custom code from the repo).
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=False)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    # bf16 on GPU, fp32 on CPU
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto"
)

# Wrap the question in the <user> ... <assistant> prompt format.
user_input = "What is the Samprapti (pathogenesis) of Amavata according to Ayurveda?"
prompt = f"<user> {user_input} <assistant>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a response without tracking gradients.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=300,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        temperature=0.6,
        eos_token_id=tokenizer.eos_token_id,
        use_cache=False
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
```
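Because `generate` returns the prompt tokens followed by the newly generated tokens, the decoded string in the example above contains the question as well as the answer. A minimal sketch of two small helpers follows; `build_prompt` and `new_tokens_only` are hypothetical names, and the single-turn `<user> ... <assistant>` layout is taken from the example rather than from a documented template.

```python
from typing import Sequence


def build_prompt(user_input: str) -> str:
    """Format a single-turn question the same way as the inference example."""
    return f"<user> {user_input.strip()} <assistant>"


def new_tokens_only(output_ids: Sequence[int], prompt_length: int) -> Sequence[int]:
    """Drop the prompt tokens from a generated sequence, keeping only the continuation."""
    return output_ids[prompt_length:]


if __name__ == "__main__":
    print(build_prompt("What is Amavata?"))
    # Toy token ids: the first three stand for the prompt, the rest for the answer.
    print(new_tokens_only([11, 12, 13, 21, 22], prompt_length=3))
```

With the objects from the inference example, the same slicing could be applied as `tokenizer.decode(new_tokens_only(output[0], inputs["input_ids"].shape[-1]), skip_special_tokens=True)` to print only the model's answer.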
📊 Benchmark Results: AyurParam vs Baselines
1. Overall Performance
Similar Range Models
| Model | bba | bba_English | bba_Hindi |
|---|---|---|---|
| AyurParam-2.9B-Instruct | 39.97 | 41.12 | 38.04 |
| Llama-3.2-3B-Instruct | 33.20 | 35.31 | 29.67 |
| Qwen2.5-3B-Instruct | 32.68 | 35.22 | 28.46 |
| granite-3.1-2b | 31.10 | 33.39 | 27.30 |
| gemma-2-2b-it | 28.40 | 29.38 | 26.79 |
| Llama-3.2-1B-Instruct | 26.41 | 26.77 | 25.82 |
Larger Models
| Model | bba | bba_English | bba_Hindi |
|---|---|---|---|
| AyurParam-2.9B-Instruct | 39.97 | 41.12 | 38.04 |
| gemma-2-27b-it | 37.99 | 40.45 | 33.89 |
| Pangea-7B | 37.41 | 40.69 | 31.93 |
| gpt-oss-20b | 36.34 | 38.30 | 33.09 |
| Indic-gemma-7B-Navarasa-2.0 | 35.13 | 37.12 | 31.83 |
| Llama-3.1-8B-Instruct | 34.76 | 36.86 | 31.26 |
| Nemotron-4-Mini-Hindi-4B-Instruct | 33.54 | 33.38 | 33.82 |
| aya-23-8B | 31.97 | 33.84 | 28.87 |
2. Question Difficulty
Similar Range Models
| Difficulty | AyurParam-2.9B-Instruct | Llama-3.2-3B | Qwen2.5-3B | granite-3.1-2b | gemma-2-2b-it | Llama-3.2-1B |
|---|---|---|---|---|---|---|
| Easy | 43.93 | 36.42 | 35.55 | 33.90 | 29.96 | 27.44 |
| Medium | 35.95 | 29.66 | 29.57 | 28.06 | 26.83 | 25.23 |
| Hard | 31.21 | 28.51 | 28.23 | 26.81 | 24.96 | 25.39 |
Larger Models
| Difficulty | AyurParam-2.9B-Instruct | gemma-2-27b-it | Pangea-7B | gpt-oss-20b | Llama-3.1-8B | Indic-gemma-7B | Nemotron-4-Mini-Hindi-4B | aya-23-8B |
|---|---|---|---|---|---|---|---|---|
| Easy | 43.93 | 43.47 | 41.45 | 42.03 | 39.43 | 38.54 | 36.08 | 35.51 |
| Medium | 35.95 | 31.90 | 32.94 | 30.27 | 29.36 | 31.72 | 30.80 | 28.29 |
| Hard | 31.21 | 30.78 | 31.77 | 26.67 | 30.50 | 27.23 | 29.50 | 25.11 |
3. Question Type
Similar Range Models
| Type | Llama-3.2-1B | Qwen2.5-3B | Llama-3.2-3B | AyurParam-2.9B-Instruct | granite-3.1-2b | gemma-2-2b-it |
|---|---|---|---|---|---|---|
| Assertion/Reasoning | 59.26 | 51.85 | 40.74 | 44.44 | 33.33 | 33.33 |
| Fill in the blanks | 26.97 | 29.21 | 34.83 | 29.78 | 21.35 | 32.02 |
| MCQ | 26.34 | 32.70 | 33.17 | 40.12 | 31.22 | 28.33 |
| Match the column | 26.83 | 29.27 | 29.27 | 24.39 | 29.27 | 36.59 |
Larger Models
| Type | Indic-gemma-7B | Pangea-7B | gemma-2-27b-it | AyurParam-2.9B-Instruct | Nemotron-4-Mini-Hindi-4B | gpt-oss-20b | Llama-3.1-8B | aya-23-8B |
|---|---|---|---|---|---|---|---|---|
| Assertion/Reasoning | 59.26 | 62.96 | 55.56 | 44.44 | 37.04 | 25.93 | 29.63 | 18.52 |
| Fill in the blanks | 35.39 | 24.16 | 35.96 | 29.78 | 30.34 | 32.02 | 26.97 | 30.90 |
| MCQ | 35.10 | 37.53 | 37.98 | 40.12 | 33.60 | 36.39 | 34.83 | 32.05 |
| Match the column | 31.71 | 34.15 | 39.02 | 24.39 | 24.39 | 46.34 | 46.34 | 17.07 |
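From the above results, AyurParam outperforms all similar-sized models on the overall bba score (in both English and Hindi) and at every difficulty level. It also matches or exceeds the larger models on most of these metrics; the exception is Hard questions, where Pangea-7B is slightly ahead (31.77 vs 31.21). By question type, AyurParam has the highest MCQ score of all listed models, while its results on Assertion/Reasoning, Fill in the blanks, and Match the column are mixed.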
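The overall comparison can be checked directly from the first pair of tables. The sketch below copies the overall scores from those tables and reports, for each metric, the strongest baseline and AyurParam's margin over it; the variable and function names are illustrative.

```python
# Overall scores copied from the tables above: (bba, bba_English, bba_Hindi).
METRICS = ("bba", "bba_English", "bba_Hindi")
AYURPARAM = (39.97, 41.12, 38.04)
BASELINES = {
    "Llama-3.2-3B-Instruct": (33.20, 35.31, 29.67),
    "Qwen2.5-3B-Instruct": (32.68, 35.22, 28.46),
    "granite-3.1-2b": (31.10, 33.39, 27.30),
    "gemma-2-2b-it": (28.40, 29.38, 26.79),
    "Llama-3.2-1B-Instruct": (26.41, 26.77, 25.82),
    "gemma-2-27b-it": (37.99, 40.45, 33.89),
    "Pangea-7B": (37.41, 40.69, 31.93),
    "gpt-oss-20b": (36.34, 38.30, 33.09),
    "Indic-gemma-7B-Navarasa-2.0": (35.13, 37.12, 31.83),
    "Llama-3.1-8B-Instruct": (34.76, 36.86, 31.26),
    "Nemotron-4-Mini-Hindi-4B-Instruct": (33.54, 33.38, 33.82),
    "aya-23-8B": (31.97, 33.84, 28.87),
}


def margin_over_best_baseline(metric_index: int):
    """Return (best baseline name, AyurParam score minus that baseline's score)."""
    best_name = max(BASELINES, key=lambda name: BASELINES[name][metric_index])
    margin = AYURPARAM[metric_index] - BASELINES[best_name][metric_index]
    return best_name, round(margin, 2)


if __name__ == "__main__":
    for index, metric in enumerate(METRICS):
        name, margin = margin_over_best_baseline(index)
        print(f"{metric}: best baseline {name}, margin {margin:+.2f}")
```

On these numbers the margins over the strongest baseline are 1.98 points on bba (gemma-2-27b-it), 0.43 on bba_English (Pangea-7B), and 4.15 on bba_Hindi (gemma-2-27b-it).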
Contact
For any questions or feedback, please contact: