Update README.md
README.md CHANGED
@@ -65,6 +65,8 @@ The model can be used with the following frameworks;
 We recommend using this model with the [vLLM library](https://github.com/vllm-project/vllm)
 to implement production-ready inference pipelines.
 
+**Note**: We recommend using a relatively low temperature, such as `temperature=0.15`.
+
 **_Installation_**
 
 Make sure you install [`vLLM >= 0.6.4`](https://github.com/vllm-project/vllm/releases/tag/v0.6.4):
@@ -162,8 +164,7 @@ messages = [
 # note that running this model on GPU requires over 60 GB of GPU RAM
 llm = LLM(model=model_name, tokenizer_mode="mistral", tensor_parallel_size=8)
 
-sampling_params = SamplingParams(max_tokens=512)
-
+sampling_params = SamplingParams(max_tokens=512, temperature=0.15)
 outputs = llm.chat(messages, sampling_params=sampling_params)
 
 print(outputs[0].outputs[0].text)
@@ -193,7 +194,7 @@ messages = [
     {"role": "system", "content": "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."},
     {"role": "user", "content": "Give me 5 non-formal ways to say 'See you later' in French."},
 ]
-chatbot = pipeline("text-generation", model="mistralai/Mistral-Small-24B-Instruct-2501", max_new_tokens=256)
+chatbot = pipeline("text-generation", model="mistralai/Mistral-Small-24B-Instruct-2501", max_new_tokens=256, temperature=0.15)
 chatbot(messages)
 ```
 
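For readers skimming the diff, here is a minimal sketch of how the updated vLLM snippet reads once the change is applied. The model name, `tokenizer_mode`, `tensor_parallel_size`, and sampling settings come straight from the hunks above; the message list is borrowed from the README's pipeline example and is purely illustrative.

```python
# Minimal sketch of the vLLM offline-inference flow after this change:
# SamplingParams now carries the recommended low temperature (0.15).
from vllm import LLM, SamplingParams

model_name = "mistralai/Mistral-Small-24B-Instruct-2501"

# Illustrative chat messages (reused from the README's pipeline example).
messages = [
    {"role": "system", "content": "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."},
    {"role": "user", "content": "Give me 5 non-formal ways to say 'See you later' in French."},
]

# note that running this model on GPU requires over 60 GB of GPU RAM
llm = LLM(model=model_name, tokenizer_mode="mistral", tensor_parallel_size=8)

sampling_params = SamplingParams(max_tokens=512, temperature=0.15)
outputs = llm.chat(messages, sampling_params=sampling_params)

print(outputs[0].outputs[0].text)
```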
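Likewise, a sketch of the Transformers path after the change, assembled from the final hunk; only the import and the message list (also taken from the README) are added here so it is self-contained.

```python
# Minimal sketch of the Transformers pipeline usage after this change:
# the pipeline now receives temperature=0.15 alongside max_new_tokens.
from transformers import pipeline

messages = [
    {"role": "system", "content": "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."},
    {"role": "user", "content": "Give me 5 non-formal ways to say 'See you later' in French."},
]

chatbot = pipeline("text-generation", model="mistralai/Mistral-Small-24B-Instruct-2501", max_new_tokens=256, temperature=0.15)
chatbot(messages)
```

Note that in Transformers the `temperature` setting only takes effect when sampling is enabled (for example with `do_sample=True`); under greedy decoding it is ignored.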