dolfsai/Jan-nano-128k-W8A8

This is a compressed version of Menlo/Jan-nano-128k, quantized with llm-compressor using the W8A8 scheme. A sketch of the compression recipe and a vLLM usage example are given below.

Model Details

  • Original Model: Menlo/Jan-nano-128k
  • Quantization Method: GPTQ
  • Compression Libraries: llm-compressor
  • Calibration Dataset: ultrachat_200k (1024 samples)
  • Optimized For: Inference with vLLM (see the usage example below)
  • License: Same as the original model
  • Format: Safetensors
  • Model Size: 4B parameters
  • Tensor Types: BF16, INT8
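Compression Recipe

The card does not include the exact quantization script, so the following is a minimal sketch of llm-compressor's GPTQ path. The sequence length, dataset preprocessing, `lm_head` exclusion, and output directory are assumptions; only the W8A8 GPTQ scheme and the 1024 ultrachat_200k calibration samples come from the details above.

```python
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Menlo/Jan-nano-128k"
SAVE_DIR = "Jan-nano-128k-W8A8"  # assumed output path

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# W8A8: int8 GPTQ weights plus int8 dynamic activations on all Linear layers.
# Leaving lm_head unquantized is the common default, assumed here.
recipe = GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"])

oneshot(
    model=model,
    dataset="ultrachat_200k",      # calibration set listed in the card
    recipe=recipe,
    max_seq_length=2048,           # assumed; the card does not state it
    num_calibration_samples=1024,  # from the card
)

model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```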
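Usage with vLLM

A minimal inference sketch: vLLM reads the compressed-tensors config from the checkpoint and selects the int8 W8A8 kernels automatically. The prompt, sampling parameters, and `max_model_len` are placeholders, not values from the card.

```python
from vllm import LLM, SamplingParams

# 131072 tokens = the model's 128k context window; lower max_model_len
# if KV-cache memory is tight on your GPU.
llm = LLM(model="dolfsai/Jan-nano-128k-W8A8", max_model_len=131072)

sampling = SamplingParams(temperature=0.7, top_p=0.8, max_tokens=256)
outputs = llm.generate(["Explain W8A8 quantization in one sentence."], sampling)
print(outputs[0].outputs[0].text)
```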

Model tree for dolfsai/Jan-nano-128k-W8A8

  • Base model: Qwen/Qwen3-4B-Base
  • Finetuned: Qwen/Qwen3-4B
  • Finetuned: Menlo/Jan-nano
  • Quantized: dolfsai/Jan-nano-128k-W8A8 (this model)

Dataset used to calibrate dolfsai/Jan-nano-128k-W8A8

  • ultrachat_200k