YOYO-AI committed
Commit 048c7a2 · verified · 1 Parent(s): 099eb1f

Update README.md

  1. README.md +47 -0
README.md CHANGED
@@ -15,3 +15,50 @@ tags:
  Combine the three methods of **della**, **ties**, and **model stock** to merge the instruction model with the base model.
 
  The aim is to fix the decline in **instruction-following ability** and **mathematical ability** caused by using only the ties merging method or only the della merging method.
+
+ ```yaml
+ models:
+   - model: Qwen/Qwen2.5-14B-instruct
+     parameters:
+       density: 1
+       weight: 1
+       lambda: 0.9
+ merge_method: della
+ base_model: Qwen/Qwen2.5-14B
+ parameters:
+   density: 1
+   weight: 1
+   lambda: 0.9
+   normalize: true
+   int8_mask: true
+ dtype: bfloat16
+ name: Qwen2.5-14B-della
+ ```
+ ```yaml
+ models:
+   - model: Qwen/Qwen2.5-14B-instruct
+     parameters:
+       density: 1
+       weight: 1
+ merge_method: ties
+ base_model: Qwen/Qwen2.5-14B
+ parameters:
+   density: 1
+   weight: 1
+   normalize: true
+   int8_mask: true
+ dtype: bfloat16
+ name: Qwen2.5-14B-ties
+ ```
+ ```yaml
+ merge_method: model_stock
+ base_model: Qwen/Qwen2.5-14B-instruct
+ models:
+   - model: Qwen2.5-14B-della
+   - model: Qwen2.5-14B-ties
+ dtype: bfloat16
+ tokenizer_source: base
+ int8_mask: true
+ normalize: true
+ name: Qwen2.5-14B-it-restore
+ ```
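
The third config refers to `Qwen2.5-14B-della` and `Qwen2.5-14B-ties`, which are the `name` values of the first two configs, so the della and ties merges presumably have to exist before the model stock merge can run. The sketch below is not part of the commit: it is a minimal illustration of that ordering, assuming each `name` identifies the output of its config. The `CONFIGS` list and the `merge_order` helper are hypothetical names introduced here, and only the fields needed for ordering are copied from the YAML above.

```python
# Hypothetical sketch: order the three merge configs so that every
# intermediate model is produced before the config that consumes it.
# Assumption: a config's `name` is the identifier other configs use
# to refer to its output.

CONFIGS = [
    {
        "name": "Qwen2.5-14B-it-restore",
        "merge_method": "model_stock",
        "base_model": "Qwen/Qwen2.5-14B-instruct",
        "models": ["Qwen2.5-14B-della", "Qwen2.5-14B-ties"],
    },
    {
        "name": "Qwen2.5-14B-della",
        "merge_method": "della",
        "base_model": "Qwen/Qwen2.5-14B",
        "models": ["Qwen/Qwen2.5-14B-instruct"],
    },
    {
        "name": "Qwen2.5-14B-ties",
        "merge_method": "ties",
        "base_model": "Qwen/Qwen2.5-14B",
        "models": ["Qwen/Qwen2.5-14B-instruct"],
    },
]


def merge_order(configs):
    """Return config names in an order that satisfies their dependencies."""
    by_name = {cfg["name"]: cfg for cfg in configs}
    ordered = []
    visiting = set()

    def visit(name):
        if name in ordered:
            return
        if name in visiting:
            raise ValueError(f"circular dependency at {name}")
        visiting.add(name)
        cfg = by_name[name]
        # Only names defined by another config count as dependencies;
        # everything else is treated as an already available model.
        for dep in [cfg["base_model"], *cfg["models"]]:
            if dep in by_name:
                visit(dep)
        visiting.discard(name)
        ordered.append(name)

    for cfg in configs:
        visit(cfg["name"])
    return ordered


if __name__ == "__main__":
    for step, name in enumerate(merge_order(CONFIGS), start=1):
        print(step, name)
```

Even though the list above starts with the final config, the helper places the della and ties merges first and `Qwen2.5-14B-it-restore` last. How the intermediate outputs are actually stored and resolved depends on the merge tooling and is not specified in this commit.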