vllm: error: unrecognized arguments: --enable-reasoning
VLLM_USE_MODELSCOPE=true VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 vllm serve /home/user/model_zoo/Qwen3-Next-80B-A3B-Thinking --port 8000 --tensor-parallel-size 4 --max-model-len 262144 --enable-reasoning --reasoning-parser deepseek_r1
INFO 09-12 11:12:03 [__init__.py:216] Automatically detected platform cuda.
usage: vllm [-h] [-v] {chat,complete,serve,bench,collect-env,run-batch} ...
vllm: error: unrecognized arguments: --enable-reasoning
(qwen_next) user@share_dev_:$ /home/user/miniconda3/envs/qwen_next/bin/vllm --version
INFO 09-12 11:14:34 [__init__.py:216] Automatically detected platform cuda.
0.10.2rc3.dev1+g12a8414d8.cu129
The '--enable-reasoning' argument was removed from vLLM a few versions ago. You just need --reasoning-parser.
I believe the correct settings are:
--enable-auto-tool-choice --tool-call-parser qwen3_coder --reasoning-parser qwen3
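Putting that together with the original command, the full corrected invocation might look like the following. This is a sketch: it assumes the same model path and environment variables as the failing command, and the parser names (qwen3_coder, qwen3) come from the suggestion above rather than from verification against this exact vLLM build.

```shell
# Same command as before, minus the removed --enable-reasoning flag;
# reasoning output is enabled implicitly by passing --reasoning-parser.
VLLM_USE_MODELSCOPE=true VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 vllm serve \
    /home/user/model_zoo/Qwen3-Next-80B-A3B-Thinking \
    --port 8000 \
    --tensor-parallel-size 4 \
    --max-model-len 262144 \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_coder \
    --reasoning-parser qwen3
```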