Commit c627f91 (verified) by prudant
Parent: 66641f6

Update README.md

Files changed (1):
  1. README.md +6 -0
@@ -12,6 +12,12 @@ pipeline_tag: text-ranking
 
 This is a compressed version of tomaarsen/Qwen3-Reranker-0.6B-seq-cls using llm-compressor with the following scheme: W8A8
 
+## Serving
+
+``python3 -m vllm.entrypoints.openai.api_server --download-dir '/data' --model 'dolfsai/Qwen3-Reranker-0.6B-seq-cls-vllm-W8A8' --task classify``
+
+**Important**: You MUST read the following guide for correct usage of this model here [Guide](https://github.com/vllm-project/vllm/pull/19260)
+
 ## Model Details
 
 - **Original Model**: tomaarsen/Qwen3-Reranker-0.6B-seq-cls
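
Review note: the serving command added in this commit starts an HTTP server but does not show how a client would call it. The sketch below is one possible client, not something taken from the README. It assumes the server listens on vLLM's default `http://localhost:8000` and that the `/classify` route is available for a model started with `--task classify`; the helper names `build_classify_request` and `classify` are hypothetical. It sends whatever text it is given and does not apply the prompt formatting described in the linked guide, so inputs should be prepared according to that guide first.

```python
import json
import urllib.request

# Assumptions (not stated in the README): default vLLM host/port and the /classify route.
BASE_URL = "http://localhost:8000"
# Model name copied from the serving command in the diff.
MODEL = "dolfsai/Qwen3-Reranker-0.6B-seq-cls-vllm-W8A8"


def build_classify_request(texts, base_url=BASE_URL, model=MODEL):
    """Build (without sending) a JSON POST request for the classification route."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/classify",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(texts, timeout=30.0):
    """Send the request to a running server and return the decoded JSON response."""
    request = build_classify_request(texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `classify(["<text formatted as the guide describes>"])` returns the server's JSON response as a Python dictionary. Keeping request construction separate from the network call makes the payload easy to inspect before anything is sent.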