Update README.md
Browse files
README.md
CHANGED
|
@@ -23,6 +23,9 @@ This repo contains the FP8 version of **Qwen3-4B**, which has the following feat
|
|
| 23 |
|
| 24 |
For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [blog](https://qwenlm.github.io/blog/qwen3/), [GitHub](https://github.com/QwenLM/Qwen3), and [Documentation](https://qwen.readthedocs.io/en/latest/).
|
| 25 |
|
|
|
|
|
|
|
|
|
|
| 26 |
## Quickstart
|
| 27 |
|
| 28 |
The code of Qwen3 has been in the latest Hugging Face `transformers` and we advise you to use the latest version of `transformers`.
|
|
@@ -126,7 +129,7 @@ However, please pay attention to the following known issues:
|
|
| 126 |
|
| 127 |
> [!TIP]
|
| 128 |
> The `enable_thinking` switch is also available in APIs created by vLLM and SGLang.
|
| 129 |
-
> Please refer to
|
| 130 |
|
| 131 |
### `enable_thinking=True`
|
| 132 |
|
|
|
|
| 23 |
|
| 24 |
For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [blog](https://qwenlm.github.io/blog/qwen3/), [GitHub](https://github.com/QwenLM/Qwen3), and [Documentation](https://qwen.readthedocs.io/en/latest/).
|
| 25 |
|
| 26 |
+
> [!TIP]
|
| 27 |
+
> If you encounter significant endless repetitions, please refer to the [Best Practices](#best-practices) section for optimal sampling parameters, and set the ``presence_penalty`` to 1.5.
|
| 28 |
+
|
| 29 |
## Quickstart
|
| 30 |
|
| 31 |
The code of Qwen3 has been in the latest Hugging Face `transformers` and we advise you to use the latest version of `transformers`.
|
|
|
|
| 129 |
|
| 130 |
> [!TIP]
|
| 131 |
> The `enable_thinking` switch is also available in APIs created by vLLM and SGLang.
|
| 132 |
+
> Please refer to our documentation for [vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes) and [SGLang](https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes) users.
|
| 133 |
|
| 134 |
### `enable_thinking=True`
|
| 135 |
|