Add comprehensive model card with usage instructions and evaluation results
README.md
CHANGED
@@ -39,8 +39,8 @@ This LoRA adapter enhances google/gemma-3-1b-it with structured reasoning capabi
 - **Training Method**: GRPO (Group Relative Policy Optimization)
 - **LoRA Rank**: 64
 - **LoRA Alpha**: 128
-- **Training Samples**:
-- **Thinking Tag Usage**:
+- **Training Samples**: 107
+- **Thinking Tag Usage**: 40.0%
 - **Average Quality Score**: 0.00
 
 ## 🔧 Usage
@@ -68,7 +68,7 @@ Problem: If a train travels 120 miles in 2 hours, then increases its speed by 30
 Response:'''
 
 inputs = tokenizer(prompt, return_tensors="pt")
-outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.
+outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.2)
 response = tokenizer.decode(outputs[0], skip_special_tokens=True)
 print(response)
 ```
@@ -131,7 +131,7 @@ The model was trained on self-generated reasoning problems across multiple domai
 ## 🔬 Evaluation
 
 The adapter was evaluated on diverse reasoning tasks:
-- Thinking tag usage rate:
+- Thinking tag usage rate: 40.0%
 - Average reasoning quality score: 0.00
 - Response comprehensiveness: 0 words average
 
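
The model card does not say how the evaluation figures were computed. The following is a minimal sketch of one plausible way to derive a thinking tag usage rate and an average word count from a list of responses. The `<think>...</think>` tag name, the function names, and the sample responses are all assumptions made for illustration, not details taken from the adapter's evaluation code.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think>.
# The model card does not name the actual tag.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def thinking_tag_usage_rate(responses):
    """Return the percentage of responses with at least one complete thinking block."""
    if not responses:
        return 0.0
    used = sum(1 for text in responses if THINK_BLOCK.search(text))
    return 100.0 * used / len(responses)


def average_word_count(responses):
    """Return the mean number of whitespace-separated words per response."""
    if not responses:
        return 0.0
    return sum(len(text.split()) for text in responses) / len(responses)


# Hypothetical responses, written only to exercise the two functions.
sample = [
    "<think>120 / 2 = 60 mph</think> The speed is 60 mph.",
    "The answer is 90 mph.",
    "<think>60 + 30 = 90</think> 90 mph.",
    "No reasoning shown.",
    "Unclosed <think> tag only.",  # not counted: the block is never closed
]

usage = thinking_tag_usage_rate(sample)
avg_words = average_word_count(sample)
print(f"Thinking tag usage rate: {usage:.1f}%")
print(f"Average words per response: {avg_words:.1f}")
```

Requiring a closed block means a response that opens a tag but is cut off by `max_new_tokens` does not count as using thinking tags. A looser metric could match the opening tag alone.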