# Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GPTQ-int4
This is a GPTQ INT4 quantized version of Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled.
Please refer to the original model card for details on the model architecture, training data, and capabilities.
**Note:** While the original fine-tuning focused on text-only reasoning tasks, this model inherits multimodal capabilities from the base Qwen3.5-27B. The vision encoder is preserved and functional for image understanding tasks.
## Quantization Details
- Method: GPTQ (4-bit INT4, W4A16)
- Group Size: 128
- Calibration: 1024 samples from C4 dataset
- Vision Encoder: Preserved (not quantized)
- MTP Module: Preserved (not quantized)
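To make the "4-bit, group size 128" scheme concrete, here is a toy NumPy sketch of group-wise W4A16 quantization: weights are split into groups of 128, each group stores its own scale and zero-point, and values are rounded to 4-bit integers. This is a didactic illustration only, not the actual GPTQ algorithm (which additionally uses Hessian-based error compensation during rounding).

```python
import numpy as np

GROUP_SIZE = 128  # each group of 128 weights shares one scale / zero-point

def quantize_group(w):
    """Asymmetric 4-bit quantization of one group of weights."""
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / 15.0  # 4 bits -> 16 levels (0..15)
    q = np.clip(np.round((w - w_min) / scale), 0, 15).astype(np.uint8)
    return q, scale, w_min

def dequantize_group(q, scale, w_min):
    """Reconstruct approximate fp32 weights from 4-bit codes."""
    return q.astype(np.float32) * scale + w_min

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(GROUP_SIZE * 4,)).astype(np.float32)

# Quantize each 128-element group independently, then reconstruct.
w_hat = np.concatenate([
    dequantize_group(*quantize_group(g))
    for g in w.reshape(-1, GROUP_SIZE)
])

rel_err = np.abs(w - w_hat).mean() / np.abs(w).mean()
print(f"mean relative reconstruction error: {rel_err:.3%}")
```

Smaller group sizes give each scale less dynamic range to cover and therefore lower quantization error, at the cost of storing more scales; 128 is a common middle ground.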
## Usage with vLLM
### Text-only
```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="codgician/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GPTQ-int4",
    trust_remote_code=True,
    max_model_len=4096,
    gpu_memory_utilization=0.9,
)

sampling_params = SamplingParams(temperature=0.7, max_tokens=2048)

prompt = "Explain the difference between TCP and UDP protocols."
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```
### With Image (Multimodal)
```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="codgician/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GPTQ-int4",
    trust_remote_code=True,
    max_model_len=4096,
    gpu_memory_utilization=0.9,
)

sampling_params = SamplingParams(temperature=0.7, max_tokens=256)

# OpenAI-style chat messages; the image is passed by URL.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
            {"type": "text", "text": "What is in this image?"},
        ],
    }
]

outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```
## Hardware Requirements
| Precision | VRAM (Approx.) |
|---|---|
| INT4 GPTQ | ~18 GB |
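The ~18 GB figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below uses assumed numbers: 27B parameters at 4 bits each, plus per-group metadata (an fp16 scale and a 4-bit zero-point per 128 weights). The remainder of the budget goes to the unquantized vision encoder, KV cache, and runtime overhead.

```python
# Rough VRAM estimate for the quantized weights alone (assumed figures).
params = 27e9          # approximate total parameter count
bits_per_weight = 4    # INT4 storage
group_size = 128

# Each group of 128 weights carries an fp16 scale (16 bits) and a 4-bit
# zero-point, i.e. about (16 + 4) / 128 extra bits per weight.
overhead_bits = (16 + 4) / group_size

weight_gb = params * (bits_per_weight + overhead_bits) / 8 / 1e9
print(f"quantized weights alone: ~{weight_gb:.1f} GB")
```

The weights come out around 14 GB, leaving roughly 4 GB of the ~18 GB estimate for the fp16 vision encoder, KV cache, and activations.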
## Acknowledgements
- Original model by Jackrong
- Base model: Qwen/Qwen3.5-27B
- Quantization performed using GPTQModel
## License
Apache 2.0 (inherited from original model)