Based on the Unsloth BF16 GGUF and imatrix file. The quantization types were not selected programmatically: I carefully checked every detail of the imatrix statistics and obtained quantization suggestions from Qwen3-235B-A22B, DeepSeek V3.1, Gemini 2.5 Pro, and ChatGPT.
Full protection of the first dense layers (0-2).
Full protection of the output tensor and the embedding layer.
Further compression is possible with llama.cpp.
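Since several tensors are deliberately kept at BF16, one way to compress this build further is to re-run llama.cpp's `llama-quantize` with `--allow-requantize` and downgrade the protected tensors. This is only a sketch; the paths and the Q8_0 target types are illustrative, not a recommendation:

```shell
# Sketch only: requantize the BF16-protected embedding/output tensors
# (paths and target types are hypothetical examples).
./llama-quantize --allow-requantize \
    --token-embedding-type Q8_0 \
    --output-tensor-type Q8_0 \
    DeepSeek-V3.1-IQ1_S.gguf DeepSeek-V3.1-IQ1_S-small.gguf IQ1_S
```

Expect some quality loss relative to this upload, since those tensors were protected on purpose.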

Quantization details

```
--output-tensor-type BF16
--token-embedding-type BF16
--tensor-type attn_k_b=MXFP4 --tensor-type blk.[0|1|2|3|4].attn_k_b=BF16
--tensor-type attn_kv_a_mqa=Q4_K --tensor-type blk.[0|1|2].attn_kv_a_mqa=BF16
--tensor-type attn_output=IQ3_XXS --tensor-type blk.[0|1|2|3|4|5].attn_output=BF16 --tensor-type blk.58.attn_output=Q5_K --tensor-type blk.[59|60].attn_output=Q6_K
--tensor-type attn_q_a=Q4_K --tensor-type blk.[0|1|2].attn_q_a=BF16
--tensor-type attn_q_b=Q4_K --tensor-type blk.[0|1|2|3|4|5].attn_q_b=BF16 --tensor-type blk.6.attn_q_b=Q6_K
--tensor-type attn_v_b=Q6_K --tensor-type blk.[0|1|2].attn_v_b=BF16
--tensor-type blk.[0|1|2].ffn_down=BF16
--tensor-type blk.[0|1|2].ffn_up=BF16
--tensor-type blk.[0|1|2].ffn_gate=BF16
--tensor-type ffn_gate_exps=IQ1_S --tensor-type blk.[3|60].ffn_gate_exps=IQ2_XS
--tensor-type ffn_up_exps=IQ1_S --tensor-type blk.[3|60].ffn_up_exps=IQ2_XS
--tensor-type ffn_gate_shexp=Q6_K --tensor-type blk.[3|60].ffn_gate_shexp=BF16
--tensor-type ffn_up_shexp=Q6_K --tensor-type blk.[3|60].ffn_up_shexp=BF16
--tensor-type ffn_down_shexp=Q6_K --tensor-type blk.[3|60].ffn_down_shexp=BF16
--tensor-type ffn_down_exps=IQ1_S
--tensor-type blk.[3|4].ffn_down_exps=BF16
--tensor-type blk.[5|6|7|8|9|33|46|59|60].ffn_down_exps=MXFP4
--tensor-type blk.[25-38,40-45].ffn_down_exps=IQ2_XS
--tensor-type blk.39.ffn_down_exps=IQ2_S
```
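For context, these per-tensor overrides are passed to llama.cpp's `llama-quantize` together with the imatrix. A hedged sketch of the invocation (file names are hypothetical; the full override list above is elided for brevity):

```shell
# Sketch of the quantization command; paths are illustrative.
# The base type is IQ1_S; --tensor-type flags override it per tensor.
./llama-quantize \
    --imatrix imatrix.dat \
    --output-tensor-type BF16 \
    --token-embedding-type BF16 \
    --tensor-type attn_k_b=MXFP4 \
    # ... remaining --tensor-type overrides from the list above ...
    DeepSeek-V3.1-BF16.gguf DeepSeek-V3.1-IQ1_S.gguf IQ1_S
```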

Model size: 671B params · Architecture: deepseek2