---
base_model:
- black-forest-labs/FLUX.1-dev
pipeline_tag: text-to-image
library_name: gguf
license: other
license_name: flux-1-dev-non-commercial-license
tags:
- gguf
- flux
- text-to-image
- imatrix
---
# Supported?
Expect broken or faulty items for the time being. Use at your own discretion.
- ComfyUI-GGUF: all? (CPU/CUDA)
  - Fast dequant: BF16, Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, Q6_K, Q5_K, Q4_K, Q3_K, Q2_K
  - Slow dequant: others [via GGUF/NumPy](https://github.com/city96/ComfyUI-GGUF/blob/379175e7bf8b65019cdd11108bb882120a6f17df/dequant.py#L24-L28)
- Forge: TBC
- stable-diffusion.cpp: [llama.cpp Feature-matrix](https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix)
  - CPU: all
  - CUDA: all?
  - Vulkan: >= Q3_K_S, > IQ4_S; [PR IQ1_S, IQ1_M](https://github.com/ggerganov/llama.cpp/pull/11528) [PR IQ4_XS](https://github.com/ggerganov/llama.cpp/pull/11501)
  - other: ?
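
The ComfyUI-GGUF entries above amount to a lookup by tensor type. A small sketch of that lookup, where the function name `dequant_path` is hypothetical and the fast set is copied from the list. F16 and F32 are treated as needing no dequantization because they are not quantized:

```python
# Tensor types listed above as having a fast dequant path in ComfyUI-GGUF.
FAST_DEQUANT = {
    "BF16", "Q8_0", "Q5_1", "Q5_0", "Q4_1", "Q4_0",
    "Q6_K", "Q5_K", "Q4_K", "Q3_K", "Q2_K",
}

# Unquantized float types: nothing to dequantize.
UNQUANTIZED = {"F32", "F16"}


def dequant_path(tensor_type):
    """Hypothetical helper: classify a ggml tensor type by dequant path."""
    t = tensor_type.upper()
    if t in UNQUANTIZED:
        return "none"
    if t in FAST_DEQUANT:
        return "fast"
    # Everything else falls back to the slower GGUF/NumPy route.
    return "slow (GGUF/NumPy fallback)"


for t in ("Q4_K", "IQ4_XS", "F16"):
    print(t, "->", dequant_path(t))
```

Note that the lookup is per tensor type (for example `Q4_K`), not per file name (for example `Q4_K_S`), since a single file mixes several tensor types.
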
# Disco
Dynamic quantization:
- time_in.in_layer: Q8_0/Q6_K
- final_layer, vector_in.in_layer, guidance_in: Q8_0
- vector_in.out_layer, time_in.out_layer, txt_in, img_in: F16
- single_blocks.[> 10 && < 37].modulation.lin: one down?
| Filename | Quant type | File Size | Description / L2 Loss Step 25 | Example Image |
| -------- | ---------- | --------- | ----------------------------- | ------------- |
# Caesar
Combined imatrix from multiple images at 512x512 and 768x768, with 25, 30 and 50 steps, computed with [city96/flux1-dev-Q8_0](https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q8_0.gguf) and the euler sampler.
data: `load_imatrix: loaded 314 importance matrix entries from imatrix_caesar.dat computed on 475 chunks`
Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch).
Dynamic quantization:
- img_in, guidance_in.in_layer, final_layer.linear: f32/bf16/f16
- guidance_in, final_layer: bf16/f16
- img_attn.qkv, linear1: some layers two bits up
- txt_mod.lin, txt_mlp, txt_attn.proj: some layers one bit down
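
The rules above can be read as per-tensor overrides on top of the base quant level. A minimal sketch of that idea, under stated assumptions: the bit ladder, the function names, and the `up_blocks`/`down_blocks` sets are hypothetical (the list does not say which layers are shifted), and this is not the logic in lcpp.patch:

```python
import re

# Hypothetical ladder of bit widths; the real quantizer picks concrete ggml types.
LADDER = [1, 2, 3, 4, 5, 6, 8, 16]

KEEP_HIGH = ("img_in.", "guidance_in.", "final_layer.")  # kept at 16-bit or higher
BUMP_UP = ("img_attn.qkv", "linear1")                    # some layers two steps up
BUMP_DOWN = ("txt_mod.lin", "txt_mlp", "txt_attn.proj")  # some layers one step down


def block_index(name):
    """Return the double/single block number in a tensor name, or None."""
    m = re.search(r"(?:double|single)_blocks\.(\d+)\.", name)
    return int(m.group(1)) if m else None


def shift(bits, steps):
    """Move along the ladder, clamping at both ends."""
    i = LADDER.index(bits) + steps
    return LADDER[max(0, min(i, len(LADDER) - 1))]


def pick_bits(name, base_bits, up_blocks=frozenset(), down_blocks=frozenset()):
    """Pick a bit width for one tensor; the block sets are assumptions."""
    if name.startswith(KEEP_HIGH):
        return 16
    idx = block_index(name)
    if any(k in name for k in BUMP_UP) and idx in up_blocks:
        return shift(base_bits, 2)
    if any(k in name for k in BUMP_DOWN) and idx in down_blocks:
        return shift(base_bits, -1)
    return base_bits


print(pick_bits("double_blocks.0.img_attn.qkv.weight", 2, up_blocks={0}))
```
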
## Experimental from f16
| Filename | Quant type | File Size | Description / L2 Loss Step 25 | Example Image |
| -------- | ---------- | --------- | ----------------------------- | ------------- |
| [flux1-dev-IQ1_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ1_S.gguf) | IQ1_S | 2.41GB | worst / 173 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ1_S_512_25_woman.png) |
| [flux1-dev-TQ1_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-TQ1_0.gguf) | TQ1_0 | 2.64GB | worst / 195 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_TQ1_0_512_25_woman.png) |
| [flux1-dev-IQ1_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ1_M.gguf) | IQ1_M | 2.72GB | worst / 171 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ1_M_512_25_woman.png) |
| [flux1-dev-IQ2_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ2_XXS.gguf) | IQ2_XXS | 3.10GB | worst * / 126 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ2_XXS_512_25_woman.png) |
| [flux1-dev-TQ2_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-TQ2_0.gguf) | TQ2_0 | 3.12GB | worst / 202 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_TQ2_0_512_25_woman.png) |
| [flux1-dev-IQ2_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ2_XS.gguf) | IQ2_XS | 3.48GB | worst / 140 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ2_XS_512_25_woman.png) |
| [flux1-dev-IQ2_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ2_S.gguf) | IQ2_S | 3.51GB | worst / 142 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ2_S_512_25_woman.png) |
| [flux1-dev-IQ2_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ2_M.gguf) | IQ2_M | 3.84GB | bad / 120 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ2_M_512_25_woman.png) |
| [flux1-dev-Q2_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q2_K_S.gguf) | Q2_K_S | 4.00GB | ok * / 52 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q2_K_S_512_25_woman.png) |
| [flux1-dev-Q2_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q2_K.gguf) | Q2_K | 4.03GB | ok / 55 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q2_K_512_25_woman.png) |
| [flux1-dev-IQ3_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ3_XXS.gguf) | IQ3_XXS | 4.56GB | ok / 92 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ3_XXS_512_25_woman.png) |
| [flux1-dev-IQ3_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ3_XS.gguf) | IQ3_XS | 5.05GB | bad / 125 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ3_XS_512_25_woman.png) |
| [flux1-dev-Q3_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q3_K_S.gguf) | Q3_K_S | 5.10GB | ok / 48 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q3_K_S_512_25_woman.png) |
| [flux1-dev-IQ3_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ3_S.gguf) | IQ3_S | 5.11GB | bad / 123 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ3_S_512_25_woman.png) |
| [flux1-dev-Q3_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q3_K_M.gguf) | Q3_K_M | 5.13GB | ok / 50 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q3_K_M_512_25_woman.png) |
| [flux1-dev-IQ3_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ3_M.gguf) | IQ3_M | 5.14GB | bad / 123 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ3_M_512_25_woman.png) |
| [flux1-dev-Q3_K_L.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q3_K_L.gguf) | Q3_K_L | 5.17GB | ok / 61 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q3_K_L_512_25_woman.png) |
| [flux1-dev-IQ4_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ4_XS.gguf) | IQ4_XS | 6.33GB | good / 33 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ4_XS_512_25_woman.png) |
| [flux1-dev-Q4_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q4_K_S.gguf) | Q4_K_S | 6.66GB | good / 22 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q4_K_S_512_25_woman.png) |
| [flux1-dev-Q4_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q4_K_M.gguf) | Q4_K_M | 6.69GB | good / 21 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q4_K_M_512_25_woman.png) |
| [flux1-dev-IQ4_NL.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ4_NL.gguf) | IQ4_NL | 6.69GB | good / 24 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_IQ4_NL_512_25_woman.png) |
| [flux1-dev-Q4_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q4_0.gguf) | Q4_0 | 6.81GB | good / 30 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q4_0_512_25_woman.png) |
| [flux1-dev-Q4_1.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q4_1.gguf) | Q4_1 | 7.55GB | good / 27 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q4_1_512_25_woman.png) |
| [flux1-dev-Q5_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q5_K_S.gguf) | Q5_K_S | 8.26GB | nice / 21 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q5_K_S_512_25_woman.png) |
| [flux1-dev-Q5_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q5_0.gguf) | Q5_0 | 8.27GB | good / 30 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q5_0_512_25_woman.png) |
| [flux1-dev-Q5_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q5_K_M.gguf) | Q5_K_M | 8.30GB | nice / 23 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q5_K_M_512_25_woman.png) |
| [flux1-dev-Q5_1.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q5_1.gguf) | Q5_1 | 8.99GB | nice * / 14 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q5_1_512_25_woman.png) |
| [flux1-dev-Q6_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q6_K.gguf) | Q6_K | 9.80GB | nice / 20 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q6_K_512_25_woman.png) |
| [flux1-dev-Q8_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q8_0.gguf) | Q8_0 | 12.3GB | near perfect * / 8 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/images/output_test_caesar_Q8_0_512_25_woman.png) |
| - | F16 | 23.8GB | reference | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_F16_512_25_woman.png) |
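
The "L2 Loss Step 25" column compares each quant against the F16 reference. How the numbers were produced is not stated here, so the following is only a sketch of one plausible reading: the Euclidean distance between two flattened latents taken at the same step. The function name `l2_loss` and the absence of any normalization are assumptions:

```python
import math


def l2_loss(latent, reference):
    """Euclidean (L2) distance between two flattened latents of equal length."""
    if len(latent) != len(reference):
        raise ValueError("latents must have the same number of elements")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(latent, reference)))


# Toy values only; real latents would come from the sampler at step 25.
print(l2_loss([3.0, 0.0], [0.0, 4.0]))
```
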
| Filename | Bits img_attn.qkv & linear1 |
| -------- | --------------------------- |
| [flux1-dev-IQ1_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ1_S.gguf) | 333M MMMM M111 ... 11MM MM11 |
| [flux1-dev-TQ1_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-TQ1_0.gguf) | 3332 2222 2111 ... 1122 2211 |
| [flux1-dev-IQ1_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ1_M.gguf) | 3332 2222 2111 ... 1122 2211 |
| [flux1-dev-IQ2_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ2_XXS.gguf) | 4433 3333 3222 ... 2222 |
| [flux1-dev-TQ2_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-TQ2_0.gguf) | 3332 2222 2111 ... 1122 2211 |
| [flux1-dev-IQ2_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ2_XS.gguf) | 4443 3333 3222 ... 2233 3322 |
| [flux1-dev-IQ2_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ2_S.gguf) | 4444 4444 4444 4444 4433 3222 ... 2233 3322 |
| [flux1-dev-IQ2_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ2_M.gguf) | 4444 4444 4444 4444 4433 3222 ... 2223 3333 3322 |
| [flux1-dev-Q2_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q2_K_S.gguf) | 4443 3333 3222 ... 2222 |
| [flux1-dev-Q2_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q2_K.gguf) | 4443 3333 3222 ... 2233 3322 |
| [flux1-dev-IQ3_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ3_XXS.gguf) | 444S SSSS S333 ... 3333 |
| [flux1-dev-IQ3_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ3_XS.gguf) | 444S SSSS S333 ... 33SS SS33 |
| [flux1-dev-Q3_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q3_K_S.gguf) | 5554 4444 4333 ... 3333 |
| [flux1-dev-IQ3_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ3_S.gguf) | 5554 4444 4333 ... 3344 4433 |
| [flux1-dev-Q3_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q3_K_M.gguf) | 5554 4444 4333 ... 3344 4433 |
| [flux1-dev-IQ3_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ3_M.gguf) | 5554 4444 4444 4444 4433 ... 3344 4433 |
| [flux1-dev-Q3_K_L.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q3_K_L.gguf) | 5554 4444 4444 4444 4433 ... 3344 4433 |
| [flux1-dev-IQ4_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ4_XS.gguf) | 8885 5555 5444 ... 4444 |
| [flux1-dev-Q4_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q4_K_S.gguf) | 8885 5555 5444 ... 4444 |
| [flux1-dev-Q4_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q4_K_M.gguf) | 8885 5555 5555 5555 5544 ... 4444 |
| [flux1-dev-IQ4_NL.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-IQ4_NL.gguf) | 8885 5555 5555 5555 5544 ... 4444 |
| [flux1-dev-Q4_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q4_0.gguf) | 8885 5555 5444 ... 4444 |
| [flux1-dev-Q4_1.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q4_1.gguf) | 8885 5555 5444 ... 4444 |
| [flux1-dev-Q5_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q5_K_S.gguf) | FFF6 6666 6666 6666 6655 ... 5555 |
| [flux1-dev-Q5_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q5_0.gguf) | FFF8 8888 8555 ... 5555 |
| [flux1-dev-Q5_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q5_K_M.gguf) | FFF8 8888 8666 6666 6655 ... 5555 |
| [flux1-dev-Q5_1.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q5_1.gguf) | FFF8 8888 8555 ... 5555 |
| [flux1-dev-Q6_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q6_K.gguf) | FFF8 8888 8666 ... 6666 |
| [flux1-dev-Q8_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-caesar/flux1-dev-Q8_0.gguf) | FFF8 8888 ... 8888 |
## Observations
- More imatrix data doesn't necessarily result in better quants
- I-quants worse than k-quants at the same bit width?
- [Quant-dequant loss](https://huggingface.co/Eviation/flux-imatrix/blob/main/images/loss_quants.png)
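
Quant-dequant loss measures how much a tensor changes when it is quantized and then restored. A minimal sketch of that round trip for a Q8_0-style format (blocks of 32 values sharing one scale, 8-bit integers), in plain Python. It simplifies the ggml reference (the scale stays in full precision instead of fp16), and using MSE as the metric is an assumption; the linked plot may use a different one:

```python
def q8_0_roundtrip(values, block_size=32):
    """Quantize to 8-bit integers per block, then dequantize again."""
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        amax = max(abs(v) for v in block)
        scale = amax / 127.0  # one scale per block
        if scale == 0.0:
            out.extend(0.0 for _ in block)
            continue
        # round() maps each value to an integer in [-127, 127]
        out.extend(round(v / scale) * scale for v in block)
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


weights = [i / 10 - 1.6 for i in range(64)]  # toy data, two blocks
print(mse(weights, q8_0_roundtrip(weights)))
```
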
# Bravo
Combined imatrix from multiple 512x512 images at 25 and 50 steps, computed with [city96/flux1-dev-Q8_0](https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q8_0.gguf) and the euler sampler.
Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch).
## Experimental from f16
| Filename | Quant type | File Size | Description / L2 Loss Step 25 | Example Image |
| -------- | ---------- | --------- | ----------------------------- | ------------- |
| [flux1-dev-IQ1_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ1_S.gguf) | IQ1_S | 2.45GB | worst / 156 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ1_S_512_25_woman.png) |
| [flux1-dev-IQ1_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ1_M.gguf) | IQ1_M | 2.72GB | worst / 141 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ1_M_512_25_woman.png) |
| [flux1-dev-IQ2_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_XXS.gguf) | IQ2_XXS | 3.19GB | worst / 131 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ2_XXS_512_25_woman.png) |
| [flux1-dev-IQ2_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_XS.gguf) | IQ2_XS | 3.56GB | worst / 125 | - |
| [flux1-dev-IQ2_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_S.gguf) | IQ2_S | 3.56GB | worst / 125 | - |
| [flux1-dev-IQ2_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_M.gguf) | IQ2_M | 3.93GB | worst / 120 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ2_M_512_25_woman.png) |
| [flux1-dev-Q2_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q2_K_S.gguf) | Q2_K_S | 4.02GB | ok / 56 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_Q2_K_S_512_25_woman.png) |
| [flux1-dev-IQ3_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ3_XXS.gguf) | IQ3_XXS | 4.66GB | TBC / 68 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ3_XXS_512_25_woman.png) |
| [flux1-dev-IQ3_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ3_XS.gguf) | IQ3_XS | 5.22GB | worse than IQ3_XXS / 115 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ3_XS_512_25_woman.png) |
| flux1-dev-IQ3_S.gguf | IQ3_S | TBC | TBC | - |
| flux1-dev-IQ3_M.gguf | IQ3_M | TBC | TBC | - |
| [flux1-dev-Q3_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q3_K_S.gguf) | Q3_K_S | 5.22GB | TBC / 34 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_Q3_K_S_512_25_woman.png) |
| [flux1-dev-IQ4_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ4_XS.gguf) | IQ4_XS | 6.42GB | TBC / 25 | - |
| [flux1-dev-Q4_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q4_0.gguf) | Q4_0 | 6.79GB | TBC / 31 | - |
| [flux1-dev-IQ4_NL.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ4_NL.gguf) | IQ4_NL | 6.79GB | TBC / 21 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ4_NL_512_25_woman.png) |
| [flux1-dev-Q4_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q4_K_S.gguf) | Q4_K_S | 6.79GB | TBC / 29 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_Q4_K_S_512_25_woman.png) |
| [flux1-dev-Q4_1.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q4_1.gguf) | Q4_1 | 7.53GB | TBC / 24 | - |
| [flux1-dev-Q5_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q5_0.gguf) | Q5_0 | 8.27GB | TBC / 25 | - |
| flux1-dev-Q5_1.gguf | Q5_1 | TBC | TBC / 24 | - |
| [flux1-dev-Q5_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q5_K_S.gguf) | Q5_K_S | 8.27GB | TBC / 20 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_Q5_K_S_512_25_woman.png) |
| [flux1-dev-Q6_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q6_K.gguf) | Q6_K | 9.84GB | TBC / 19 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_Q6_K_512_25_woman.png) |
| flux1-dev-Q8_0.gguf | Q8_0 | - | TBC / 10 | - |
| - | F16 | 23.8GB | reference | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_F16_512_25_woman.png) |
## Observations
- Bravo IQ1_S worse than Alpha?
- [Latent loss](https://huggingface.co/Eviation/flux-imatrix/blob/main/images/latent_loss.png)
- [Per layer quantization cost](https://huggingface.co/Eviation/flux-imatrix/blob/main/images/casting_cost.png) from [chrisgoringe/casting_cost](https://github.com/chrisgoringe/mixed-gguf-converter/blob/main/costs/casting_cost.yaml)
- Per layer quantization cost 2 from [Freepik/flux.1-lite-8B](https://huggingface.co/Freepik/flux.1-lite-8B): [double blocks](https://huggingface.co/Freepik/flux.1-lite-8B/blob/main/sample_images/mse_mmdit_img.png) and [single blocks](https://huggingface.co/Freepik/flux.1-lite-8B/blob/main/sample_images/mse_dit_img.png)
- [Ablation latent loss per weight type](https://huggingface.co/Eviation/flux-imatrix/blob/main/images/latent_loss_ablation.png)
- [Pareto front loss vs. size](https://huggingface.co/Eviation/flux-imatrix/blob/main/images/latent_loss_size_pareto.png)
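
A Pareto front over size and loss keeps only the quants for which no smaller-or-equal file has a lower-or-equal loss. A short sketch using the Bravo rows above that list both a file size and a loss; the function name `pareto_front` is hypothetical, and the linked plot may have been produced differently:

```python
# (quant type, file size in GB, L2 loss at step 25), copied from the Bravo table.
ROWS = [
    ("IQ1_S", 2.45, 156), ("IQ1_M", 2.72, 141), ("IQ2_XXS", 3.19, 131),
    ("IQ2_XS", 3.56, 125), ("IQ2_S", 3.56, 125), ("IQ2_M", 3.93, 120),
    ("Q2_K_S", 4.02, 56), ("IQ3_XXS", 4.66, 68), ("IQ3_XS", 5.22, 115),
    ("Q3_K_S", 5.22, 34), ("IQ4_XS", 6.42, 25), ("Q4_0", 6.79, 31),
    ("IQ4_NL", 6.79, 21), ("Q4_K_S", 6.79, 29), ("Q4_1", 7.53, 24),
    ("Q5_0", 8.27, 25), ("Q5_K_S", 8.27, 20), ("Q6_K", 9.84, 19),
]


def pareto_front(rows):
    """Walk rows by increasing size and keep each strict improvement in loss."""
    front = []
    best_loss = float("inf")
    for name, size, loss in sorted(rows, key=lambda r: (r[1], r[2])):
        if loss < best_loss:
            front.append(name)
            best_loss = loss
    return front


print(pareto_front(ROWS))
```

Rows with identical size and loss (IQ2_XS and IQ2_S here) count once; the stable sort keeps whichever is listed first.
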
# Alpha
Simple imatrix from a single 512x512 image at 8/20 steps, computed with [city96/flux1-dev-Q3_K_S](https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q3_K_S.gguf) and the euler sampler.
data: `load_imatrix: loaded 314 importance matrix entries from imatrix.dat computed on 7 chunks`.
Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch).
## Experimental from q8
| Filename | Quant type | File Size | Description / L2 Loss Step 25 | Example Image |
| -------- | ---------- | --------- | ----------------------------- | ------------- |
| [flux1-dev-IQ1_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ1_S.gguf) | IQ1_S | 2.45GB | worst / 152 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ1_S_512_25_woman.png) |
| - | IQ1_M | - | broken | - |
| [flux1-dev-TQ1_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-TQ1_0.gguf) | TQ1_0 | 2.63GB | TBC / 220 | - |
| [flux1-dev-TQ2_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-TQ2_0.gguf) | TQ2_0 | 3.19GB | TBC / 220 | - |
| [flux1-dev-IQ2_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_XXS.gguf) | IQ2_XXS | 3.19GB | worst / 130 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ2_XXS_512_25_woman.png) |
| [flux1-dev-IQ2_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_XS.gguf) | IQ2_XS | 3.56GB | worst / 129 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ2_XS_512_25_woman.png) |
| [flux1-dev-IQ2_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_S.gguf) | IQ2_S | 3.56GB | worst / 129 | - |
| [flux1-dev-IQ2_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_M.gguf) | IQ2_M | 3.93GB | worst / 121 | - |
| [flux1-dev-Q2_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q2_K.gguf) | Q2_K | 4.02GB | TBC / 77 | - |
| [flux1-dev-Q2_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q2_K_S.gguf) | Q2_K_S | 4.02GB | ok / 77 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q2_K_S_512_25_woman.png) |
| [flux1-dev-IQ3_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_XXS.gguf) | IQ3_XXS | 4.66GB | TBC / 130 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ3_XXS_512_25_woman.png) |
| [flux1-dev-IQ3_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_XS.gguf) | IQ3_XS | 5.22GB | TBC / 114 | - |
| [flux1-dev-IQ3_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_S.gguf) | IQ3_S | 5.22GB | TBC / 114 | - |
| [flux1-dev-IQ3_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_M.gguf) | IQ3_M | 5.22GB | TBC / 114 | - |
| [flux1-dev-Q3_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_S.gguf) | Q3_K_S | 5.22GB | TBC / 36 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q3_K_S_512_25_woman.png) |
| [flux1-dev-Q3_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_K.gguf) | Q3_K_M | 5.36GB | TBC / 42 | - |
| [flux1-dev-Q3_K_L.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_L.gguf) | Q3_K_L | 5.36GB | TBC / 42 | - |
| [flux1-dev-IQ4_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ4_XS.gguf) | IQ4_XS | 6.42GB | TBC / 30 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ4_XS_512_25_woman.png) |
| [flux1-dev-IQ4_NL.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ4_NL.gguf) | IQ4_NL | 6.79GB | TBC / 23 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ4_NL_512_25_woman.png) |
| [flux1-dev-Q4_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_0.gguf) | Q4_0 | 6.79GB | TBC / 27 | - |
| - | Q4_K | TBC | TBC / 27 | - |
| [flux1-dev-Q4_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_K_S.gguf) | Q4_K_S | 6.79GB | TBC / 26 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q4_K_S_512_25_woman.png) |
| [flux1-dev-Q4_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_K_M.gguf) | Q4_K_M | 6.93GB | TBC / 27 | - |
| [flux1-dev-Q4_1.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_1.gguf) | Q4_1 | 7.53GB | TBC / 23 | - |
| [flux1-dev-Q5_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q5_K_S.gguf) | Q5_K_S | 8.27GB | TBC / 19 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q5_K_S_512_25_woman.png) |
| [flux1-dev-Q5_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q5_K.gguf) | Q5_K | 8.41GB | TBC / 20 | - |
| - | Q5_K_M | TBC | TBC | - |
| [flux1-dev-Q6_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q6_K.gguf) | Q6_K | 9.84GB | TBC / 22 | - |
| - | Q8_0 | 12.7GB | near perfect / 10 | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q8_512_25_woman.png) |
| - | F16 | 23.8GB | reference | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_F16_512_25_woman.png) |
## Observations
Sub-quants are not differentiated as expected: IQ2_XS == IQ2_S, IQ3_XS == IQ3_S == IQ3_M, Q3_K_M == Q3_K_L.
- Check if [lcpp_sd3.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp_sd3.patch) includes more specific quant level logic
- Extrapolate the existing level logic
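
One quick way to spot undifferentiated sub-quants is to group table rows that share both file size and loss. A sketch using a subset of the Alpha rows above; the function name `identical_groups` is hypothetical, and matching size and loss is only a heuristic, not proof that the files are identical:

```python
from collections import defaultdict

# (quant type, file size in GB, L2 loss at step 25), subset of the Alpha table.
ROWS = [
    ("IQ2_XXS", 3.19, 130), ("IQ2_XS", 3.56, 129), ("IQ2_S", 3.56, 129),
    ("IQ2_M", 3.93, 121), ("Q2_K", 4.02, 77), ("Q2_K_S", 4.02, 77),
    ("IQ3_XS", 5.22, 114), ("IQ3_S", 5.22, 114), ("IQ3_M", 5.22, 114),
    ("Q3_K_S", 5.22, 36), ("Q3_K_M", 5.36, 42), ("Q3_K_L", 5.36, 42),
]


def identical_groups(rows):
    """Group quant types whose size and loss are both identical."""
    groups = defaultdict(list)
    for name, size, loss in rows:
        groups[(size, loss)].append(name)
    return [names for names in groups.values() if len(names) > 1]


for group in identical_groups(ROWS):
    print(" == ".join(group))
```

On these rows the heuristic also pairs Q2_K with Q2_K_S, which share the same size and loss in the table.
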