patrickvonplaten committed · Commit f944fd9 · verified · 1 Parent(s): d306c53

Update README.md

Files changed (1):
  1. README.md +4 -3
README.md CHANGED
@@ -65,6 +65,8 @@ The model can be used with the following frameworks;
 We recommend using this model with the [vLLM library](https://github.com/vllm-project/vllm)
 to implement production-ready inference pipelines.
 
+**Note**: We recommend using a relatively low temperature, such as `temperature=0.15`.
+
 **_Installation_**
 
 Make sure you install [`vLLM >= 0.6.4`](https://github.com/vllm-project/vllm/releases/tag/v0.6.4):
@@ -162,8 +164,7 @@ messages = [
 # note that running this model on GPU requires over 60 GB of GPU RAM
 llm = LLM(model=model_name, tokenizer_mode="mistral", tensor_parallel_size=8)
 
-sampling_params = SamplingParams(max_tokens=512)
-
+sampling_params = SamplingParams(max_tokens=512, temperature=0.15)
 outputs = llm.chat(messages, sampling_params=sampling_params)
 
 print(outputs[0].outputs[0].text)
@@ -193,7 +194,7 @@ messages = [
     {"role": "system", "content": "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."},
     {"role": "user", "content": "Give me 5 non-formal ways to say 'See you later' in French."},
 ]
-chatbot = pipeline("text-generation", model="mistralai/Mistral-Small-24B-Instruct-2501", max_new_tokens=256)
+chatbot = pipeline("text-generation", model="mistralai/Mistral-Small-24B-Instruct-2501", max_new_tokens=256, temperature=0.15)
 chatbot(messages)
 ```
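For reference, here is a minimal, self-contained sketch of the vLLM example as it reads after this change. The hunks above only show fragments, so `model_name` and the `messages` list are filled in as assumptions: `model_name` is taken to be `mistralai/Mistral-Small-24B-Instruct-2501` (the checkpoint named in the transformers hunk), and `messages` is reused from the transformers example.

```python
# Sketch of the full vLLM chat example after this commit. model_name and
# messages are assumptions based on the rest of the README, not shown in
# the hunks above.
from vllm import LLM, SamplingParams

model_name = "mistralai/Mistral-Small-24B-Instruct-2501"  # assumed; defined earlier in the README

messages = [
    {"role": "system", "content": "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."},
    {"role": "user", "content": "Give me 5 non-formal ways to say 'See you later' in French."},
]

# note that running this model on GPU requires over 60 GB of GPU RAM
llm = LLM(model=model_name, tokenizer_mode="mistral", tensor_parallel_size=8)

# the commit adds temperature=0.15 here, matching the new low-temperature recommendation
sampling_params = SamplingParams(max_tokens=512, temperature=0.15)

outputs = llm.chat(messages, sampling_params=sampling_params)

print(outputs[0].outputs[0].text)
```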
 
 
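Likewise, a self-contained sketch of the updated transformers pipeline example: the `pipeline` call and `messages` list match the hunk above exactly; only the import is assumed from the surrounding README code.

```python
# Sketch of the full transformers example after this commit; only the
# import line is assumed, everything else matches the hunk above.
from transformers import pipeline

messages = [
    {"role": "system", "content": "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."},
    {"role": "user", "content": "Give me 5 non-formal ways to say 'See you later' in French."},
]

# the commit adds temperature=0.15 here as well
chatbot = pipeline("text-generation", model="mistralai/Mistral-Small-24B-Instruct-2501", max_new_tokens=256, temperature=0.15)
chatbot(messages)
```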