---
license: apache-2.0
datasets:
  - HuggingFaceH4/ultrachat_200k
language:
  - en
  - es
base_model:
  - Menlo/Jan-nano-128k
pipeline_tag: text-generation
tags:
  - vllm
  - gptq
  - agentic
---

# dolfsai/Jan-nano-128k-W8A8

This is a compressed version of Menlo/Jan-nano-128k, quantized with llm-compressor using the W8A8 scheme (8-bit weights and 8-bit activations). Details are listed below.

## Model Details

- **Original Model:** Menlo/Jan-nano-128k
- **Quantization Method:** GPTQ
- **Compression Library:** llm-compressor
- **Calibration Dataset:** ultrachat_200k (1,024 samples)
- **Optimized For:** inference with vLLM
- **License:** Apache-2.0 (same as the original model)
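
Since the model is optimized for vLLM, a minimal serving sketch is shown below. This assumes a recent vLLM build that auto-detects compressed-tensors/W8A8 checkpoints; the context-length flag is illustrative and should be adjusted to your hardware.

```shell
# Launch vLLM's OpenAI-compatible server with this quantized checkpoint.
# 131072 matches the 128k context window advertised by the base model;
# lower it if you run out of GPU memory.
vllm serve dolfsai/Jan-nano-128k-W8A8 \
  --max-model-len 131072
```

Once the server is up, the model can be queried at `http://localhost:8000/v1` with any OpenAI-compatible client.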