MagicQuant GGUF Hybrids - Seed OSS 36B Instruct

MagicQuant is an automated quantization, benchmarking, and evolutionary hybrid-GGUF search system for LLMs.

Each release includes models optimized to outperform the standard baseline quants (Q8, Q6, Q5, Q4). If a baseline GGUF exists in this repo, it means the evolutionary engine couldn't beat it. If a baseline is missing, a hybrid configuration outperformed it so thoroughly that including it would have been pointless.

These hybrid GGUFs are built to be as small, fast, and low-drift as possible while preserving model capability.

To dive deeper into how MagicQuant works, see the main repo: MagicQuant on GitHub (by MagicCodingMan)
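For a rough sense of what "evolutionary hybrid-GGUF search" means in practice, here is a minimal, hypothetical sketch. It is not MagicQuant's actual code: the tensor-group labels, quant-type list, scoring interface, and fitness weights are all made up for illustration. The idea it shows is simply that candidate per-tensor-group quant mixes get mutated, scored on size/speed/precision loss, and the best survive each generation; see the main repo for the real implementation.

```python
# Illustrative sketch only -- not the actual MagicQuant implementation.
# Assumes a hypothetical score(mix) that builds the candidate GGUF, benchmarks it,
# and returns (file_size_gb, tps, avg_precision_loss_pct).
import random

TENSOR_GROUPS = ["E", "H", "Q", "K", "O", "U", "D"]                 # hypothetical group labels
QUANT_TYPES   = ["BF16", "Q8_0", "Q6_K", "Q5_K", "IQ4_NL", "MXFP4"]  # candidate per-group types

def mutate(mix: dict) -> dict:
    """Randomly reassign one tensor group to a different quant type."""
    child = dict(mix)
    child[random.choice(TENSOR_GROUPS)] = random.choice(QUANT_TYPES)
    return child

def fitness(size_gb: float, tps: float, loss_pct: float) -> float:
    """Smaller, faster, lower-drift is better (weights here are invented)."""
    return -(size_gb + 10.0 * loss_pct) + 0.1 * tps

def evolve(score, generations: int = 20, population: int = 8) -> dict:
    """Keep the best half each generation, refill the pool with mutants."""
    pool = [{g: "Q8_0" for g in TENSOR_GROUPS} for _ in range(population)]
    for _ in range(generations):
        ranked = sorted(pool, key=lambda m: fitness(*score(m)), reverse=True)
        elite = ranked[: population // 2]
        pool = elite + [mutate(random.choice(elite)) for _ in elite]
    return max(pool, key=lambda m: fitness(*score(m)))
```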

Notes:

  • The Hugging Face hardware-compatibility estimate that shows bit widths is usually wrong for these files. It doesn't understand hybrid mixes, so don't trust it.
  • Naming scheme can be found on the MagicQuant Wiki.
  • (Tips) Less precision loss means less brain damage. More TPS means faster. And smaller is always better, right?

Precision Loss Guide

  • 0–0.1% → God-tier, scientifically exact
  • 0.1–1% → True near-lossless, agent-ready
  • 1–3% → Minimal loss, great for personal use
  • 3–5% → Borderline, but still functional
  • 5%+ → Toys, not tools, outside MagicQuant’s scope

Learn more about precision loss here.
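As a reading of the published numbers (not necessarily the exact MagicQuant formula), the loss columns in the tables below line up with relative perplexity drift from the BF16 baseline, expressed in percent. The snippet below reproduces the Q6_K general-domain entry from the PPL tables:

```python
# Consistency check against the published tables below (assumed formula, not a spec):
#   per-domain loss % = |PPL_quant / PPL_BF16 - 1| * 100
bf16_gen, q6k_gen = 6.8872, 6.9012            # General PPL, BF16 vs Q6_K rows below
loss_general = abs(q6k_gen / bf16_gen - 1) * 100
print(f"{loss_general:.4f}%")                 # ~0.2033%, matching Q6_K's loss_general
```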

Table - File Size + TPS + Avg Precision Loss

| Model | File Size (GB) | Bench TPS | Avg Precision Loss |
|---|---|---|---|
| mxfp4_moe-HK-B16-EO-Q5K-QUD-Q8_0 | 39.71 | 17.73 | 0.0213% |
| mxfp4_moe-O-MXFP4-EHQKUD-Q8_0 | 35.78 | 18.72 | 0.0272% |
| mxfp4_moe-E-B16-D-IQ4NL-KOU-Q6K-HQ-Q8_0 | 28.02 | 24.27 | 0.1768% |
| mxfp4_moe-EHQKOUD-Q6K | 27.63 | 23.34 | 0.2037% |
| mxfp4_moe-EHQKOUD-IQ4NL | 18.95 | 32.00 | 0.2709% |
| mxfp4_moe-HQKU-IQ4NL-EOD-MXFP4 | 18.66 | 26.90 | 0.7098% |
| MXFP4_MOE | 17.90 | 20.46 | 2.7338% |

Table - PPL Columns

| Model | General PPL | Gen ±err | Code PPL | Code ±err | Math PPL | Math ±err |
|---|---|---|---|---|---|---|
| mxfp4_moe-HK-B16-EO-Q5K-QUD-Q8_0 | 6.8901 | 0.1680 | 1.4127 | 0.0095 | 5.4434 | 0.1208 |
| mxfp4_moe-O-MXFP4-EHQKUD-Q8_0 | 6.8866 | 0.1679 | 1.4130 | 0.0095 | 5.4474 | 0.1210 |
| mxfp4_moe-E-B16-D-IQ4NL-KOU-Q6K-HQ-Q8_0 | 6.8901 | 0.1682 | 1.4156 | 0.0096 | 5.4284 | 0.1203 |
| mxfp4_moe-EHQKOUD-Q6K | 6.9012 | 0.1685 | 1.4135 | 0.0095 | 5.4637 | 0.1218 |
| mxfp4_moe-EHQKOUD-IQ4NL | 6.8712 | 0.1654 | 1.4162 | 0.0095 | 5.4627 | 0.1201 |
| mxfp4_moe-HQKU-IQ4NL-EOD-MXFP4 | 6.8452 | 0.1639 | 1.4140 | 0.0094 | 5.5223 | 0.1222 |
| MXFP4_MOE | 7.1007 | 0.1728 | 1.4351 | 0.0097 | 5.6360 | 0.1239 |

Table - Precision Loss Columns

| Model | General Loss (%) | Code Loss (%) | Math Loss (%) |
|---|---|---|---|
| mxfp4_moe-HK-B16-EO-Q5K-QUD-Q8_0 | 0.0421 | 0.0071 | 0.0147 |
| mxfp4_moe-O-MXFP4-EHQKUD-Q8_0 | 0.0087 | 0.0142 | 0.0588 |
| mxfp4_moe-O-IQ4NL-EHQKUD-Q8_0 | 0.0087 | 0.0142 | 0.0588 |
| mxfp4_moe-E-B16-D-IQ4NL-KOU-Q6K-HQ-Q8_0 | 0.0421 | 0.1982 | 0.2902 |
| mxfp4_moe-EHQKOUD-Q6K | 0.2033 | 0.0495 | 0.3582 |
| mxfp4_moe-EHQKOUD-IQ4NL | 0.2323 | 0.2407 | 0.3398 |
| mxfp4_moe-HQKU-IQ4NL-EOD-MXFP4 | 0.6098 | 0.0849 | 1.4346 |
| MXFP4_MOE | 3.1000 | 1.5784 | 3.5230 |

Baseline Models (Reference)

Table - File Size + TPS + Avg Precision Loss

| Model | File Size (GB) | Bench TPS | Avg Precision Loss |
|---|---|---|---|
| BF16 | 67.35 | 11.48 | 0.0000% |
| Q8_0 | 35.78 | 17.77 | 0.0272% |
| Q6_K | 27.63 | 22.95 | 0.2037% |
| Q5_K | 23.84 | 22.04 | 0.2923% |
| IQ4_NL | 19.31 | 27.70 | 1.1076% |
| MXFP4_MOE | 17.90 | 20.46 | 2.7338% |
| Q4_K_M | 20.27 | 26.65 | 2.9161% |

Table - PPL Columns

| Model | General PPL | Gen ±err | Code PPL | Code ±err | Math PPL | Math ±err |
|---|---|---|---|---|---|---|
| BF16 | 6.8872 | 0.1679 | 1.4128 | 0.0095 | 5.4442 | 0.1209 |
| Q8_0 | 6.8866 | 0.1679 | 1.4130 | 0.0095 | 5.4474 | 0.1210 |
| Q6_K | 6.9012 | 0.1685 | 1.4135 | 0.0095 | 5.4637 | 0.1218 |
| Q5_K | 6.9056 | 0.1685 | 1.4169 | 0.0096 | 5.4616 | 0.1213 |
| IQ4_NL | 6.9599 | 0.1703 | 1.4235 | 0.0097 | 5.5264 | 0.1235 |
| MXFP4_MOE | 7.1007 | 0.1728 | 1.4351 | 0.0097 | 5.6360 | 0.1239 |
| Q4_K_M | 7.0970 | 0.1760 | 1.4235 | 0.0098 | 5.7134 | 0.1305 |

Table - Precision Loss Columns

| Model | General Loss (%) | Code Loss (%) | Math Loss (%) |
|---|---|---|---|
| BF16 | 0.0000 | 0.0000 | 0.0000 |
| Q8_0 | 0.0087 | 0.0142 | 0.0588 |
| Q6_K | 0.2033 | 0.0495 | 0.3582 |
| Q5_K | 0.2672 | 0.2902 | 0.3196 |
| IQ4_NL | 1.0556 | 0.7574 | 1.5099 |
| MXFP4_MOE | 3.1000 | 1.5784 | 3.5230 |
| Q4_K_M | 3.0462 | 0.7574 | 4.9447 |
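
A quick consistency check, again a reading of the published tables rather than an official definition: the average precision loss reported above matches the mean of the three per-domain losses.

```python
# Assumed relationship, verified against the published Q6_K row above:
#   avg_prec_loss ≈ mean(loss_general, loss_code, loss_math)
loss_general, loss_code, loss_math = 0.2033, 0.0495, 0.3582   # Q6_K row (%)
avg = (loss_general + loss_code + loss_math) / 3
print(f"{avg:.4f}%")                                          # 0.2037% -- matches Q6_K's avg_prec_loss
```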

Support

I’m a solo developer working full time for myself to achieve my dream, pouring nights and weekends into open protocols and tools that I hope make the world a little better. If you chip in, you're helping me keep the lights on while I keep shipping.

Click here to see ways to support: BTC, PayPal, GitHub Sponsors.

Or, just drop a like on the repo :)
