Update README.md
README.md CHANGED
@@ -49,6 +49,30 @@ a compact Mixture-of-Experts (MoE) model
 
 ---
 
+## Usage
+To load the fine-tuned model:
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_name = "sandlogic/indicphi-mini-moe-v3"
+
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(
+model_name,
+device_map="auto",
+load_in_4bit=True
+)
+
+prompt = "ग्रामीण क्षेत्रों में ऑनलाइन शिक्षा की समस्याएं क्या हैं?"
+
+inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
+outputs = model.generate(**inputs, max_new_tokens=100)
+
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+
+```
+
 ## Dataset Preparation
 ### Data Sources
 - **Total collected:** 561M samples from **53 datasets** from Hugging Face.
@@ -151,29 +175,7 @@ Accuracy: **(Phi-mini-MoE) 21.03 → (IndicPhi-mini) 24.46 (+3.43%)**
 
 Accuracy: **(Phi-mini-MoE) 27.47 → (IndicPhi-mini) 30.95 (+3.48%)**
 
-## Usage
-To load the fine-tuned model:
-
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-
-model_name = "sandlogic/indicphi-mini-moe-v3"
 
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForCausalLM.from_pretrained(
-model_name,
-device_map="auto",
-load_in_4bit=True
-)
-
-prompt = "ग्रामीण क्षेत्रों में ऑनलाइन शिक्षा की समस्याएं क्या हैं?"
-
-inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
-outputs = model.generate(**inputs, max_new_tokens=100)
-
-print(tokenizer.decode(outputs[0], skip_special_tokens=True))
-
-```
 
 ## Acknowledgments
 