YOYO-AI/Qwen2.5-14B-it-restore

Text Generation

YOYO-AI committed on Mar 21

Commit 048c7a2 · verified · 1 Parent(s): 099eb1f

Update README.md

Files changed (1)

README.md +47 -0

@@ -15,3 +15,50 @@ tags:
 Combine three merge methods, **della**, **ties**, and **model stock**, to merge the instruction model with the base model.
 The aim is to counter the **decline in instruction-following ability** and **mathematical ability** that occurs when only the ties merging method or only the della merging method is used.

+```yaml
+models:
+  - model: Qwen/Qwen2.5-14B-instruct
+    parameters:
+      density: 1
+      weight: 1
+      lambda: 0.9
+merge_method: della
+base_model: Qwen/Qwen2.5-14B
+parameters:
+  density: 1
+  weight: 1
+  lambda: 0.9
+  normalize: true
+  int8_mask: true
+dtype: bfloat16
+name: Qwen2.5-14B-della
+```
+```yaml
+models:
+  - model: Qwen/Qwen2.5-14B-instruct
+    parameters:
+      density: 1
+      weight: 1
+merge_method: ties
+base_model: Qwen/Qwen2.5-14B
+parameters:
+  density: 1
+  weight: 1
+  normalize: true
+  int8_mask: true
+dtype: bfloat16
+name: Qwen2.5-14B-ties
+```
+```yaml
+merge_method: model_stock
+base_model: Qwen/Qwen2.5-14B-instruct
+models:
+  - model: Qwen2.5-14B-della
+  - model: Qwen2.5-14B-ties
+dtype: bfloat16
+tokenizer_source: base
+int8_mask: true
+normalize: true
+name: Qwen2.5-14B-it-restore
+```
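
The three configs above form a pipeline: two single-model merges against `Qwen/Qwen2.5-14B`, then a `model_stock` merge of the two results with the instruct model as `base_model`. The toy sketch below illustrates the arithmetic on plain Python lists. It is not mergekit's implementation. It assumes that with `density: 1` nothing is pruned, so a single-model merge reduces to `base + lambda * (instruct - base)`, and it uses the two-model Model Stock interpolation ratio `t = k*cos / (1 + (k-1)*cos)`. The helper names `delta_merge` and `model_stock_merge` and all the numbers are hypothetical.

```python
import math

def delta_merge(base, tuned, lam=1.0):
    """Return base + lam * (tuned - base), element by element.

    Assumption: with density 1 and a single model, no delta entries are dropped.
    """
    return [b + lam * (t - b) for b, t in zip(base, tuned)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def model_stock_merge(anchor, tuned_a, tuned_b):
    """Two-model Model Stock: interpolate between the average and the anchor."""
    da = [x - a for x, a in zip(tuned_a, anchor)]
    db = [x - a for x, a in zip(tuned_b, anchor)]
    cos_theta = cosine(da, db)
    k = 2  # number of merged models
    t = k * cos_theta / (1 + (k - 1) * cos_theta)
    avg = [(x + y) / 2 for x, y in zip(tuned_a, tuned_b)]
    return [t * m + (1 - t) * a for m, a in zip(avg, anchor)], t

# Stages 1 and 2 on toy "weights": lambda 0.9 (della-like) vs. no scaling (ties-like).
base = [0.0, 1.0, 2.0]
instruct = [1.0, 1.0, 4.0]
della_like = delta_merge(base, instruct, lam=0.9)
ties_like = delta_merge(base, instruct, lam=1.0)

# Stage 3 on separate toy vectors, so both deltas from the anchor are non-zero.
merged, t = model_stock_merge([0.0, 0.0], [1.0, 0.0], [1.0, 1.0])
print(della_like, ties_like, merged, t)
```

Under these assumptions the ties-like result equals the instruct weights exactly, while the della-like result keeps 90% of the instruct delta. The closer the two deltas point in the same direction, the closer `t` gets to 1 and the more the final weights follow the average rather than the anchor.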