leejunhyeok committed · verified
Commit 8b93d6b · 1 Parent(s): e571869

add usecases in readme

Files changed (1): README.md (+49 -1)
README.md CHANGED
@@ -63,4 +63,52 @@ Detailed information including technical report will be released later.
  |---|---|---|---|---|---|---|---|---|
  ||Instruct|Instruct|Non-thinking|Thinking|Non-thinking|Thinking|Non-thinking|Thinking|
  |Average|67.08|50.95|54.97|77.82|54.66|79.55|54.78|78.66|
- |Improvement||+31.65%|+22.02%|-13.80%|+22.72%|-15.68%|+22.45%|-14.73%|
+ |Improvement||+31.65%|+22.02%|-13.80%|+22.72%|-15.68%|+22.45%|-14.73%|
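If the Improvement row is read as the relative change of this model's average (67.08) over each baseline column's average, the reported percentages can be reproduced to within rounding. A quick sanity check — all numbers are taken from the table above, but the interpretation of the row is an assumption:

```python
# Averages from the table above; first value is this model's average,
# the rest are the baseline columns, in table order.
motif_avg = 67.08
baseline_avgs = [50.95, 54.97, 77.82, 54.66, 79.55, 54.78, 78.66]
reported = [31.65, 22.02, -13.80, 22.72, -15.68, 22.45, -14.73]

for base, rep in zip(baseline_avgs, reported):
    improvement = (motif_avg - base) / base * 100
    # Deviations under 0.02 pp come from rounding in the published averages.
    assert abs(improvement - rep) < 0.02, (base, improvement, rep)
print("Improvement row matches (67.08 - baseline) / baseline * 100")
```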
+
+ ## How to use in transformers
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "Motif-Technologies/Motif-2-12.7B-Instruct",
+     trust_remote_code=True,
+     _attn_implementation="flash_attention_2",
+     dtype=torch.bfloat16,  # currently supports bf16 only, for efficiency
+ ).cuda()
+
+ tokenizer = AutoTokenizer.from_pretrained(
+     "Motif-Technologies/Motif-2-12.7B-Instruct",
+     trust_remote_code=True,
+ )
+
+ query = "What is the capital city of South Korea?"
+ input_ids = tokenizer.apply_chat_template(
+     [
+         {'role': 'system', 'content': 'You are a helpful assistant.'},
+         {'role': 'user', 'content': query},
+     ],
+     add_generation_prompt=True,
+     enable_thinking=False,  # or True
+     return_tensors='pt',
+ ).cuda()
+
+ output = model.generate(input_ids, max_new_tokens=1024, pad_token_id=tokenizer.eos_token_id)
+ output = tokenizer.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=False)
+ print(output)
+ ```
+ ### outputs
+ ```
+ # with enable_thinking=True, the model is FORCED to think.
+ Okay, the user is asking for the capital city of South Korea. Let me think. I know that South Korea's capital is Seoul. But wait, I should double-check to make sure I'm not mixing it up with other countries. For example, North Korea's capital is Pyongyang. So yes, South Korea's capital is definitely Seoul. I should just provide that as the answer.
+ </think>
+ The capital city of South Korea is **Seoul**.
+ <|endofturn|><|endoftext|>
+
+ # with enable_thinking=False, the model chooses whether or not to think. In this example, it answers directly without a thinking trace.
+ The capital city of South Korea is Seoul.
+ <|endofturn|><|endoftext|>
+ ```
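Because `skip_special_tokens=False`, the raw decoded string keeps the `</think>` marker and the end-of-turn tokens, so downstream code usually wants to separate the reasoning trace from the final answer. A minimal helper for that split — plain string handling, not part of the official model card; the marker and token names are taken from the sample output above:

```python
END_TOKENS = ("<|endofturn|>", "<|endoftext|>")

def split_thinking(text: str):
    """Split a raw decoded completion into (reasoning, answer).

    Assumes the reasoning trace, when present, ends with a literal
    </think> marker, as in the sample outputs above.
    """
    for tok in END_TOKENS:  # drop trailing special tokens
        text = text.replace(tok, "")
    reasoning, sep, answer = text.partition("</think>")
    if not sep:  # no thinking trace was emitted
        return "", text.strip()
    return reasoning.strip(), answer.strip()

raw = (
    "Okay, the user is asking for the capital city of South Korea...\n"
    "</think>\n"
    "The capital city of South Korea is **Seoul**.\n"
    "<|endofturn|><|endoftext|>"
)
print(split_thinking(raw)[1])  # The capital city of South Korea is **Seoul**.
```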
+
+ ## How to use in vLLM
+ TBD