Commit cbcdd4b (verified) by lbourdois · 1 parent: 0ce348e

Improve language tag

Hi! Since the model is multilingual, this PR adds languages other than English to the `language` tag to improve the model's referencing. Note that the README announces 29 languages but explicitly lists only 13, so I was only able to add those 13.

Files changed (1)
  1. README.md +64 -51
README.md CHANGED
@@ -1,51 +1,64 @@
- ---
- base_model:
- - Qwen/Qwen2.5-14B
- - deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- - arcee-ai/Virtuoso-Small-v2
- - Qwen/Qwen2.5-14B-Instruct
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # CoderO1-DeepSeekR1-14B-Preview (Not limited to coding, it is just a name)
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- This is based on the work of [FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Instruct-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Instruct-32B-Preview).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)
- * [arcee-ai/Virtuoso-Small-v2](https://huggingface.co/arcee-ai/Virtuoso-Small-v2)
- * [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
-   # Pivot model
-   - model: Qwen/Qwen2.5-14B
-   # Target models
-   - model: Qwen/Qwen2.5-14B-Instruct
-   - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
-   - model: arcee-ai/Virtuoso-Small-v2
- merge_method: sce
- base_model: Qwen/Qwen2.5-14B
- tokenizer:
-   source: union
- parameters:
-   select_topk: 1.0
- dtype: bfloat16
-
- ```
+ ---
+ base_model:
+ - Qwen/Qwen2.5-14B
+ - deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
+ - arcee-ai/Virtuoso-Small-v2
+ - Qwen/Qwen2.5-14B-Instruct
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+ # CoderO1-DeepSeekR1-14B-Preview (Not limited to coding, it is just a name)
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ This is based on the work of [FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Instruct-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Instruct-32B-Preview).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)
+ * [arcee-ai/Virtuoso-Small-v2](https://huggingface.co/arcee-ai/Virtuoso-Small-v2)
+ * [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+   # Pivot model
+   - model: Qwen/Qwen2.5-14B
+   # Target models
+   - model: Qwen/Qwen2.5-14B-Instruct
+   - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
+   - model: arcee-ai/Virtuoso-Small-v2
+ merge_method: sce
+ base_model: Qwen/Qwen2.5-14B
+ tokenizer:
+   source: union
+ parameters:
+   select_topk: 1.0
+ dtype: bfloat16
+
+ ```
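
The only functional change in this diff is the new `language:` list in the README front matter. A change like this could be scripted roughly as follows. This is a minimal sketch, not the tool actually used for this PR: the helper name `add_language_tags` and the plain string handling are assumptions made for illustration, and only the 13 language codes are taken from the diff.

```python
# The 13 three-letter language codes added by this commit.
LANGUAGE_CODES = [
    "zho", "eng", "fra", "spa", "por", "deu", "ita",
    "rus", "jpn", "kor", "vie", "tha", "ara",
]


def add_language_tags(readme: str, codes: list[str]) -> str:
    """Hypothetical helper: append a `language:` list to the YAML front matter.

    The README is returned unchanged if a top-level `language:` key is
    already present in the front matter.
    """
    lines = readme.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("README has no YAML front matter")

    # Locate the closing `---` delimiter of the front matter.
    end = next(
        (i for i in range(1, len(lines)) if lines[i].strip() == "---"), None
    )
    if end is None:
        raise ValueError("front matter is not closed")

    head = lines[1:end]
    if any(line.startswith("language:") for line in head):
        return readme  # already tagged, nothing to do

    # Drop blank lines at the end of the front matter, as the diff does.
    while head and not head[-1].strip():
        head.pop()

    block = ["language:"] + [f"- {code}" for code in codes]
    new_lines = ["---"] + head + block + lines[end:]
    return "\n".join(new_lines) + ("\n" if readme.endswith("\n") else "")
```

Calling `add_language_tags(readme_text, LANGUAGE_CODES)` on the old README text would place the list just before the closing `---`. A real script would more likely parse the front matter with a YAML library instead of matching lines by hand.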