Update README.md

README.md CHANGED

@@ -13,7 +13,7 @@ datasets:
 - svamp
 - multi_arith
 model-index:
-- name: SIM_COT-LLaMA3-CODI-
+- name: SIM_COT-LLaMA3-CODI-1B
   results:
   - task:
       type: math-word-problems
@@ -44,7 +44,7 @@ model-index:
       value: xx.x
 ---
 
-# SIM_COT-LLaMA3-CODI-
+# SIM_COT-LLaMA3-CODI-1B
 
 [](https://huggingface.co/internlm/SIM_COT-LLaMA3-CODI-8B)
 [](https://github.com/InternLM/SIM-CoT)
@@ -66,7 +66,7 @@ Empirical results demonstrate that SIM-CoT substantially improves both **in-doma
 
 ---
 
-**SIM_COT-LLaMA3-CODI-
+**SIM_COT-LLaMA3-CODI-1B** is an implicit-reasoning language model based on **Meta LLaMA-3.2-1B-Instruct**, fine-tuned with **SIM-CoT (Supervised Implicit Chain-of-Thought)** on top of the **CODI latent reasoning framework**.
 It is designed to improve ✨ *implicit reasoning* and 🧮 *arithmetic multi-step problem solving* across benchmarks such as **GSM8K, GSM-Hard, MultiArith, and SVAMP**.
 
 ---
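
The "CODI latent reasoning framework" named above replaces generated chain-of-thought tokens with a handful of continuous steps: the model's last hidden state is projected and appended as the next input embedding, and only the final answer is decoded as text. Below is a minimal sketch of that loop, assuming a standard Hugging Face `transformers` causal LM; `latent_cot_decode` and `projection` are hypothetical names for illustration, and `test.py` in the repository is the authoritative implementation:

```
import torch

@torch.no_grad()
def latent_cot_decode(model, tokenizer, projection, question, num_latent=6):
    """Hypothetical sketch of CODI-style latent chain-of-thought decoding."""
    inputs = tokenizer(question, return_tensors="pt")
    embeds = model.get_input_embeddings()(inputs.input_ids)

    # Latent reasoning: each step appends one continuous "thought" vector
    # instead of a generated token (cf. --num_latent 6 in the eval command).
    for _ in range(num_latent):
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        last_hidden = out.hidden_states[-1][:, -1:, :]  # (1, 1, hidden_size)
        embeds = torch.cat([embeds, projection(last_hidden)], dim=1)

    # After the latent steps, decode the explicit answer greedily.
    gen = model.generate(inputs_embeds=embeds, max_new_tokens=64, do_sample=False)
    return tokenizer.decode(gen[0], skip_special_tokens=True)
```

SIM-CoT's supervision of the latent steps applies during training; at inference time the decoding loop itself follows CODI.
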
@@ -98,9 +98,9 @@ We evaluate **SIM-CoT** across both **in-domain** (GSM8K-Aug) and **out-of-domai
 
 ## Model Details
 
-- **Base model**: [LLaMA-3.
+- **Base model**: [LLaMA-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)
 - ⚡ **Fine-tuning method**: LoRA (r=128, alpha=32)
-- **Latent reasoning**: 6 latent steps, projection dimension =
+- **Latent reasoning**: 6 latent steps, projection dimension = 2048
 - 🎯 **Dropout**: 0.0 (projection layer)
 - 🖥️ **Precision**: bf16
 - **Context length**: 512 tokens
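
The projection settings in this list correspond to the `--use_prj True`, `--prj_dim 2048`, `--prj_no_ln False`, and `--prj_dropout 0.0` flags in the evaluation command further down. One plausible reading of those flags, sketched as a PyTorch module; this is an assumption for illustration, not the repository's exact module:

```
import torch.nn as nn

class LatentProjection(nn.Module):
    """Hypothetical projection over latent hidden states, mirroring the flags:
    Linear -> LayerNorm (kept, since prj_no_ln=False) -> Dropout (0.0)."""

    def __init__(self, hidden_size=2048, prj_dim=2048, prj_dropout=0.0, prj_no_ln=False):
        super().__init__()
        self.proj = nn.Linear(hidden_size, prj_dim)
        self.norm = nn.Identity() if prj_no_ln else nn.LayerNorm(prj_dim)
        self.drop = nn.Dropout(prj_dropout)  # 0.0 here, i.e. effectively disabled

    def forward(self, hidden_state):
        return self.drop(self.norm(self.proj(hidden_state)))
```

Since 2048 is also the hidden size of LLaMA-3.2-1B, the projection maps hidden states back into the model's own embedding space, which is what allows them to be fed in as pseudo-embeddings.
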
@@ -132,32 +132,32 @@ cd SIM-CoT/CODI
 
 ### 2. Run the evaluation script
 We provide shell scripts for different backbones and datasets.
-For example, to evaluate on **LLaMA-3.
+For example, to evaluate on **LLaMA-3.2 1B** with the **SVAMP** dataset, run:
 ```
-bash
+bash test_llama1b.sh
 ```
 This will internally call the following command:
 ```
 python test.py \
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+    --data_name "svamp" \
+    --output_dir "$SAVE_DIR" \
+    --model_name_or_path path/to/Llama-3.2-1B-Instruct \
+    --seed 11 \
+    --model_max_length 512 \
+    --bf16 \
+    --lora_r 128 --lora_alpha 32 --lora_init \
+    --batch_size 128 \
+    --greedy True \
+    --num_latent 6 \
+    --use_prj True \
+    --prj_dim 2048 \
+    --prj_no_ln False \
+    --prj_dropout 0.0 \
+    --inf_latent_iterations 6 \
+    --inf_num_iterations 1 \
+    --remove_eos True \
+    --use_lora True \
+    --ckpt_dir path/to/sim_cot-checkpoints
 ```
 ### 3. Expected output
 After running, the script will print the evaluation summary.
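
For math-word-problem benchmarks such as SVAMP, the printed summary is typically an exact-match accuracy over final numeric answers. A rough sketch of that kind of check; the `last_number` helper is hypothetical, and the actual answer parsing in `test.py` may differ:

```
import re

def last_number(text):
    """Take the final number in a string as the predicted answer."""
    nums = re.findall(r"-?\d+\.?\d*", text.replace(",", ""))
    return float(nums[-1]) if nums else None

def accuracy(predictions, golds):
    """Exact-match accuracy over extracted numeric answers."""
    hits = sum(last_number(p) == last_number(g) for p, g in zip(predictions, golds))
    return hits / len(golds)

print(f"Accuracy: {accuracy(['So the answer is 42.'], ['42']):.2%}")  # Accuracy: 100.00%
```
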
|