Update README.md
README.md CHANGED
@@ -65,6 +65,8 @@ The model can be used with the following frameworks;
 We recommend using this model with the [vLLM library](https://github.com/vllm-project/vllm)
 to implement production-ready inference pipelines.
 
+**Note**: We recommend using a relatively low temperature, such as `temperature=0.15`.
+
 **_Installation_**
 
 Make sure you install [`vLLM >= 0.6.4`](https://github.com/vllm-project/vllm/releases/tag/v0.6.4):
@@ -162,8 +164,7 @@ messages = [
 # note that running this model on GPU requires over 60 GB of GPU RAM
 llm = LLM(model=model_name, tokenizer_mode="mistral", tensor_parallel_size=8)
 
-sampling_params = SamplingParams(max_tokens=512)
-
+sampling_params = SamplingParams(max_tokens=512, temperature=0.15)
 outputs = llm.chat(messages, sampling_params=sampling_params)
 
 print(outputs[0].outputs[0].text)
@@ -193,7 +194,7 @@ messages = [
     {"role": "system", "content": "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."},
     {"role": "user", "content": "Give me 5 non-formal ways to say 'See you later' in French."},
 ]
-chatbot = pipeline("text-generation", model="mistralai/Mistral-Small-24B-Instruct-2501", max_new_tokens=256)
+chatbot = pipeline("text-generation", model="mistralai/Mistral-Small-24B-Instruct-2501", max_new_tokens=256, temperature=0.15)
 chatbot(messages)
 ```
 
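For readers skimming the diff, here is a minimal sketch of how the updated vLLM snippet reads once the change is applied. The model name, `tokenizer_mode`, `tensor_parallel_size`, and sampling settings come straight from the hunks above; the message list is borrowed from the README's pipeline example and is purely illustrative.

```python
# Minimal sketch of the vLLM offline-inference flow after this change:
# SamplingParams now carries the recommended low temperature (0.15).
from vllm import LLM, SamplingParams

model_name = "mistralai/Mistral-Small-24B-Instruct-2501"

# Illustrative chat messages (reused from the README's pipeline example).
messages = [
    {"role": "system", "content": "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."},
    {"role": "user", "content": "Give me 5 non-formal ways to say 'See you later' in French."},
]

# note that running this model on GPU requires over 60 GB of GPU RAM
llm = LLM(model=model_name, tokenizer_mode="mistral", tensor_parallel_size=8)

sampling_params = SamplingParams(max_tokens=512, temperature=0.15)
outputs = llm.chat(messages, sampling_params=sampling_params)

print(outputs[0].outputs[0].text)
```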
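Likewise, a sketch of the Transformers path after the change, assembled from the final hunk; only the import and the message list (also taken from the README) are added here so it is self-contained.

```python
# Minimal sketch of the Transformers pipeline usage after this change:
# the pipeline now receives temperature=0.15 alongside max_new_tokens.
from transformers import pipeline

messages = [
    {"role": "system", "content": "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."},
    {"role": "user", "content": "Give me 5 non-formal ways to say 'See you later' in French."},
]

chatbot = pipeline("text-generation", model="mistralai/Mistral-Small-24B-Instruct-2501", max_new_tokens=256, temperature=0.15)
chatbot(messages)
```

Note that in Transformers the `temperature` setting only takes effect when sampling is enabled (for example with `do_sample=True`); under greedy decoding it is ignored.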