MagicQuant GGUF Hybrids - Seed OSS 36B Instruct

MagicQuant is an automated quantization, benchmarking, and evolutionary hybrid-GGUF search system for LLMs.

Each release includes models optimized to outperform the standard baseline quants (Q8, Q6, Q5, Q4). If a baseline GGUF exists in this repo, it means the evolutionary engine couldn't beat it. If a baseline is missing, a hybrid configuration outperformed it so thoroughly that including it would have been pointless.

These hybrid GGUFs are built to be as small, fast, and low-drift as possible while preserving model capability.

To dive deeper into how MagicQuant works, see the main repo: MagicQuant on GitHub (by MagicCodingMan)
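For a rough sense of what "evolutionary hybrid-GGUF search" means in practice, here is a minimal, hypothetical sketch. It is not MagicQuant's actual code: the tensor-group labels, quant-type list, scoring interface, and fitness weights are all made up for illustration. The idea it shows is simply that candidate per-tensor-group quant mixes get mutated, scored on size/speed/precision loss, and the best survive each generation; see the main repo for the real implementation.

```python
# Illustrative sketch only -- not the actual MagicQuant implementation.
# Assumes a hypothetical score(mix) that builds the candidate GGUF, benchmarks it,
# and returns (file_size_gb, tps, avg_precision_loss_pct).
import random

TENSOR_GROUPS = ["E", "H", "Q", "K", "O", "U", "D"]                 # hypothetical group labels
QUANT_TYPES   = ["BF16", "Q8_0", "Q6_K", "Q5_K", "IQ4_NL", "MXFP4"]  # candidate per-group types

def mutate(mix: dict) -> dict:
    """Randomly reassign one tensor group to a different quant type."""
    child = dict(mix)
    child[random.choice(TENSOR_GROUPS)] = random.choice(QUANT_TYPES)
    return child

def fitness(size_gb: float, tps: float, loss_pct: float) -> float:
    """Smaller, faster, lower-drift is better (weights here are invented)."""
    return -(size_gb + 10.0 * loss_pct) + 0.1 * tps

def evolve(score, generations: int = 20, population: int = 8) -> dict:
    """Keep the best half each generation, refill the pool with mutants."""
    pool = [{g: "Q8_0" for g in TENSOR_GROUPS} for _ in range(population)]
    for _ in range(generations):
        ranked = sorted(pool, key=lambda m: fitness(*score(m)), reverse=True)
        elite = ranked[: population // 2]
        pool = elite + [mutate(random.choice(elite)) for _ in elite]
    return max(pool, key=lambda m: fitness(*score(m)))
```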

Notes:

  • The Hugging Face hardware-compatibility estimate that shows bit widths is usually wrong for these files. It doesn't understand hybrid mixes, so don't trust it.
  • Naming scheme can be found on the MagicQuant Wiki.
  • (Tips) Less precision loss means less brain damage. More TPS means faster. And smaller is always better, right?

Precision Loss Guide

  • 0–0.1% → God-tier, scientifically exact
  • 0.1–1% → True near-lossless, agent-ready
  • 1–3% → Minimal loss, great for personal use
  • 3–5% → Borderline, but still functional
  • 5%+ → Toys, not tools, outside MagicQuant’s scope

Learn more about precision loss here.
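As a reading of the published numbers (not necessarily the exact MagicQuant formula), the loss columns in the tables below line up with relative perplexity drift from the BF16 baseline, expressed in percent. The snippet below reproduces the Q6_K general-domain entry from the PPL tables:

```python
# Consistency check against the published tables below (assumed formula, not a spec):
#   per-domain loss % = |PPL_quant / PPL_BF16 - 1| * 100
bf16_gen, q6k_gen = 6.8872, 6.9012            # General PPL, BF16 vs Q6_K rows below
loss_general = abs(q6k_gen / bf16_gen - 1) * 100
print(f"{loss_general:.4f}%")                 # ~0.2033%, matching Q6_K's loss_general
```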

Table - File Size + TPS + Avg Precision Loss

| Model | File Size (GB) | Bench TPS | Avg Precision Loss |
|---|---|---|---|
| mxfp4_moe-HK-B16-EO-Q5K-QUD-Q8_0 | 39.71 | 17.73 | 0.0213% |
| mxfp4_moe-O-MXFP4-EHQKUD-Q8_0 | 35.78 | 18.72 | 0.0272% |
| mxfp4_moe-E-B16-D-IQ4NL-KOU-Q6K-HQ-Q8_0 | 28.02 | 24.27 | 0.1768% |
| mxfp4_moe-EHQKOUD-Q6K | 27.63 | 23.34 | 0.2037% |
| mxfp4_moe-EHQKOUD-IQ4NL | 18.95 | 32.00 | 0.2709% |
| mxfp4_moe-HQKU-IQ4NL-EOD-MXFP4 | 18.66 | 26.90 | 0.7098% |
| MXFP4_MOE | 17.90 | 20.46 | 2.7338% |

Table - PPL Columns

| Model | General PPL | Gen ±err | Code PPL | Code ±err | Math PPL | Math ±err |
|---|---|---|---|---|---|---|
| mxfp4_moe-HK-B16-EO-Q5K-QUD-Q8_0 | 6.8901 | 0.1680 | 1.4127 | 0.0095 | 5.4434 | 0.1208 |
| mxfp4_moe-O-MXFP4-EHQKUD-Q8_0 | 6.8866 | 0.1679 | 1.4130 | 0.0095 | 5.4474 | 0.1210 |
| mxfp4_moe-E-B16-D-IQ4NL-KOU-Q6K-HQ-Q8_0 | 6.8901 | 0.1682 | 1.4156 | 0.0096 | 5.4284 | 0.1203 |
| mxfp4_moe-EHQKOUD-Q6K | 6.9012 | 0.1685 | 1.4135 | 0.0095 | 5.4637 | 0.1218 |
| mxfp4_moe-EHQKOUD-IQ4NL | 6.8712 | 0.1654 | 1.4162 | 0.0095 | 5.4627 | 0.1201 |
| mxfp4_moe-HQKU-IQ4NL-EOD-MXFP4 | 6.8452 | 0.1639 | 1.4140 | 0.0094 | 5.5223 | 0.1222 |
| MXFP4_MOE | 7.1007 | 0.1728 | 1.4351 | 0.0097 | 5.6360 | 0.1239 |

Table - Precision Loss Columns

| Model | General Loss (%) | Code Loss (%) | Math Loss (%) |
|---|---|---|---|
| mxfp4_moe-HK-B16-EO-Q5K-QUD-Q8_0 | 0.0421 | 0.0071 | 0.0147 |
| mxfp4_moe-O-MXFP4-EHQKUD-Q8_0 | 0.0087 | 0.0142 | 0.0588 |
| mxfp4_moe-O-IQ4NL-EHQKUD-Q8_0 | 0.0087 | 0.0142 | 0.0588 |
| mxfp4_moe-E-B16-D-IQ4NL-KOU-Q6K-HQ-Q8_0 | 0.0421 | 0.1982 | 0.2902 |
| mxfp4_moe-EHQKOUD-Q6K | 0.2033 | 0.0495 | 0.3582 |
| mxfp4_moe-EHQKOUD-IQ4NL | 0.2323 | 0.2407 | 0.3398 |
| mxfp4_moe-HQKU-IQ4NL-EOD-MXFP4 | 0.6098 | 0.0849 | 1.4346 |
| MXFP4_MOE | 3.1000 | 1.5784 | 3.5230 |

Baseline Models (Reference)

Table - File Size + TPS + Avg Precision Loss

| Model | File Size (GB) | Bench TPS | Avg Precision Loss |
|---|---|---|---|
| BF16 | 67.35 | 11.48 | 0.0000% |
| Q8_0 | 35.78 | 17.77 | 0.0272% |
| Q6_K | 27.63 | 22.95 | 0.2037% |
| Q5_K | 23.84 | 22.04 | 0.2923% |
| IQ4_NL | 19.31 | 27.70 | 1.1076% |
| MXFP4_MOE | 17.90 | 20.46 | 2.7338% |
| Q4_K_M | 20.27 | 26.65 | 2.9161% |

Table - PPL Columns

| Model | General PPL | Gen ±err | Code PPL | Code ±err | Math PPL | Math ±err |
|---|---|---|---|---|---|---|
| BF16 | 6.8872 | 0.1679 | 1.4128 | 0.0095 | 5.4442 | 0.1209 |
| Q8_0 | 6.8866 | 0.1679 | 1.4130 | 0.0095 | 5.4474 | 0.1210 |
| Q6_K | 6.9012 | 0.1685 | 1.4135 | 0.0095 | 5.4637 | 0.1218 |
| Q5_K | 6.9056 | 0.1685 | 1.4169 | 0.0096 | 5.4616 | 0.1213 |
| IQ4_NL | 6.9599 | 0.1703 | 1.4235 | 0.0097 | 5.5264 | 0.1235 |
| MXFP4_MOE | 7.1007 | 0.1728 | 1.4351 | 0.0097 | 5.6360 | 0.1239 |
| Q4_K_M | 7.0970 | 0.1760 | 1.4235 | 0.0098 | 5.7134 | 0.1305 |

Table - Precision Loss Columns

| Model | General Loss (%) | Code Loss (%) | Math Loss (%) |
|---|---|---|---|
| BF16 | 0.0000 | 0.0000 | 0.0000 |
| Q8_0 | 0.0087 | 0.0142 | 0.0588 |
| Q6_K | 0.2033 | 0.0495 | 0.3582 |
| Q5_K | 0.2672 | 0.2902 | 0.3196 |
| IQ4_NL | 1.0556 | 0.7574 | 1.5099 |
| MXFP4_MOE | 3.1000 | 1.5784 | 3.5230 |
| Q4_K_M | 3.0462 | 0.7574 | 4.9447 |
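
A quick consistency check, again a reading of the published tables rather than an official definition: the average precision loss reported above matches the mean of the three per-domain losses.

```python
# Assumed relationship, verified against the published Q6_K row above:
#   avg_prec_loss ≈ mean(loss_general, loss_code, loss_math)
loss_general, loss_code, loss_math = 0.2033, 0.0495, 0.3582   # Q6_K row (%)
avg = (loss_general + loss_code + loss_math) / 3
print(f"{avg:.4f}%")                                          # 0.2037% -- matches Q6_K's avg_prec_loss
```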

Support

I’m a solo developer working full time for myself to achieve my dream, pouring nights and weekends into open protocols and tools that I hope make the world a little better. If you chip in, you're helping me keep the lights on while I keep shipping.

Click here to see ways to support: BTC, PayPal, GitHub Sponsors.

Or, just drop a like on the repo :)
