---
base_model:
- black-forest-labs/FLUX.1-dev
pipeline_tag: text-to-image
tags:
- gguf
- flux
- text-to-image
- imatrix
---
# Supported?
Support is still experimental: expect broken or faulty quants for the time being, and use them at your own discretion.
- ComfyUI-GGUF: all? (CPU/CUDA)
- Fast dequant: BF16, Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, Q6_K, Q5_K, Q4_K, Q3_K, Q2_K
- Slow dequant: others [via GGUF/NumPy](https://github.com/city96/ComfyUI-GGUF/blob/379175e7bf8b65019cdd11108bb882120a6f17df/dequant.py#L24-L28)
- Forge: TBC
- stable-diffusion.cpp: [llama.cpp Feature-matrix](https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix)
- CPU: all
- Cuda: all?
- Vulkan: >= Q3_K_S, > IQ4_S; [PR IQ1_S, IQ1_M](https://github.com/ggerganov/llama.cpp/pull/11528) [PR IQ4_XS](https://github.com/ggerganov/llama.cpp/pull/11501)
- other: ?
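To illustrate what "dequant" means in the list above, here is a minimal NumPy sketch for the simple Q8_0 format (each block is one fp16 scale followed by 32 int8 values). This is an assumption-laden illustration, not the ComfyUI-GGUF implementation; the backends listed as "fast" above do the same math in native kernels instead.

```python
import numpy as np

def dequantize_q8_0(raw: bytes) -> np.ndarray:
    # Q8_0 block layout: 2-byte fp16 scale + 32 int8 quantized values = 34 bytes.
    blocks = np.frombuffer(raw, dtype=np.uint8).reshape(-1, 34)
    d = blocks[:, :2].copy().view(np.float16).astype(np.float32)  # per-block scale, shape (n, 1)
    qs = blocks[:, 2:].copy().view(np.int8).astype(np.float32)    # quantized values, shape (n, 32)
    return (d * qs).reshape(-1)  # weight = scale * quant
```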
# Bravo
Combined imatrix computed from multiple images at 512x512, 25 and 50 steps, using [city96/flux1-dev-Q8_0](https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q8_0.gguf) with the euler sampler.
Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch).
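The quantization step can be sketched roughly as follows. Paths, filenames, and the quant type are illustrative assumptions; the actual build is llama.cpp at cae9fb4 with city96's lcpp.patch applied, and older builds name the tool `quantize` rather than `llama-quantize`.

```shell
# Illustrative: quantize a GGUF with an importance matrix (filenames are placeholders).
./llama-quantize --imatrix imatrix.dat flux1-dev-F16.gguf flux1-dev-IQ2_XS.gguf IQ2_XS
```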
## Experimental from f16
| Filename | Quant type | File Size | Description | Example Image |
| -------- | ---------- | --------- | ----------- | ------------- |
| [flux1-dev-IQ1_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ1_S.gguf) | IQ1_S | 2.45GB | bad quality | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ1_S_512_25_woman.png) |
| [flux1-dev-IQ1_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ1_M.gguf) | IQ1_M | 2.72GB | bad quality | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ1_M_512_25_woman.png) |
| [flux1-dev-IQ2_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_XXS.gguf) | IQ2_XXS | 3.19GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ2_XXS_512_25_woman.png) |
| [flux1-dev-IQ2_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_XS.gguf) | IQ2_XS | 3.56GB | TBC | - |
| [flux1-dev-IQ2_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_S.gguf) | IQ2_S | 3.56GB | TBC | - |
| [flux1-dev-IQ2_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_M.gguf) | IQ2_M | 3.93GB | TBC | - |
| [flux1-dev-IQ3_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ3_XXS.gguf) | IQ3_XXS | 4.66GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ3_XXS_512_25_woman.png) |
| [flux1-dev-IQ3_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ3_XS.gguf) | IQ3_XS | 5.22GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ3_XS_512_25_woman.png) |
## Observations
- Bravo IQ1_S worse than Alpha?
# Alpha
Simple imatrix computed from a single 512x512 image at 8/20 steps, using [city96/flux1-dev-Q3_K_S](https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q3_K_S.gguf) with the euler sampler.
Imatrix data: `load_imatrix: loaded 314 importance matrix entries from imatrix.dat computed on 7 chunks`.
Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch).
## Experimental from q8
| Filename | Quant type | File Size | Description | Example Image |
| -------- | ---------- | --------- | ----------- | ------------- |
| [flux1-dev-IQ1_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ1_S.gguf) | IQ1_S | 2.45GB | obviously bad quality | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ1_S_512_25_woman.png) |
| - | IQ1_M | - | broken | - |
| [flux1-dev-TQ1_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-TQ1_0.gguf) | TQ1_0| 2.63GB | TBC | - |
| [flux1-dev-TQ2_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-TQ2_0.gguf) | TQ2_0 | 3.19GB | TBC | - |
| [flux1-dev-IQ2_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_XXS.gguf) | IQ2_XXS | 3.19GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ2_XXS_512_25_woman.png) |
| [flux1-dev-IQ2_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_XS.gguf) | IQ2_XS | 3.56GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ2_XS_512_25_woman.png) |
| [flux1-dev-IQ2_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_S.gguf) | IQ2_S | 3.56GB | TBC | - |
| [flux1-dev-IQ2_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_M.gguf) | IQ2_M | 3.93GB | TBC | - |
| [flux1-dev-Q2_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q2_K.gguf) | Q2_K | 4.02GB | TBC | - |
| [flux1-dev-Q2_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q2_K_S.gguf) | Q2_K_S | 4.02GB | TBC | - |
| [flux1-dev-IQ3_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_XXS.gguf) | IQ3_XXS | 4.66GB | TBC | - |
| [flux1-dev-IQ3_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_XS.gguf) | IQ3_XS | 5.22GB | TBC | - |
| [flux1-dev-IQ3_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_S.gguf) | IQ3_S | 5.22GB | TBC | - |
| [flux1-dev-IQ3_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_M.gguf) | IQ3_M | 5.22GB | TBC | - |
| [flux1-dev-Q3_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_S.gguf) | Q3_K_S | 5.22GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q3_K_S_512_25_woman.png) |
| [flux1-dev-Q3_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_M.gguf) | Q3_K_M | 5.36GB | TBC | - |
| [flux1-dev-Q3_K_L.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_L.gguf) | Q3_K_L | 5.36GB | TBC | - |
| [flux1-dev-IQ4_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ4_XS.gguf) | IQ4_XS | 6.42GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ4_XS_512_25_woman.png) |
| [flux1-dev-IQ4_NL.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ4_NL.gguf) | IQ4_NL | 6.79GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ4_NL_512_25_woman.png) |
| [flux1-dev-Q4_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_0.gguf) | Q4_0 | 6.79GB | TBC | - |
| - | Q4_K | TBC | TBC | - |
| [flux1-dev-Q4_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_K_S.gguf) | Q4_K_S | 6.79GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q4_K_S_512_25_woman.png) |
| [flux1-dev-Q4_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_K_M.gguf) | Q4_K_M | 6.93GB | TBC | - |
| [flux1-dev-Q4_1.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_1.gguf) | Q4_1 | 7.53GB | TBC | - |
| [flux1-dev-Q5_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q5_K_S.gguf) | Q5_K_S | 8.27GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q5_K_S_512_25_woman.png) |
| [flux1-dev-Q5_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q5_K.gguf) | Q5_K | 8.41GB | TBC | - |
| - | Q5_K_M | TBC | TBC | - |
| [flux1-dev-Q6_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q6_K.gguf) | Q6_K | 9.84GB | TBC | - |
| - | Q8_0 | 12.7GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q8_512_25_woman.png) |
| - | F16 | 23.8GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_F16_512_25_woman.png) |
## Observations
Sub-quants are not differentiated as expected: IQ2_XS == IQ2_S, IQ3_XS == IQ3_S == IQ3_M, and Q3_K_M == Q3_K_L.
- Check if [lcpp_sd3.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp_sd3.patch) includes more specific quant level logic
- Extrapolate the existing level logic
| Quant type | High-level quants | Mid-level quants | Low-level quants | Average |
| ---------- | ----------------- | ---------------- | ---------------- | ------- |
| IQ1_S | 5.5% 16bpw | - | 94.5% 1.5625bpw | 2.3556bpw |
| IQ2_XXS | 4.2% 16bpw | - | 95.8% 2.0625bpw | 2.6504bpw |
| IQ2_XS | 3.8% 16bpw | - | 96.2% 2.3125bpw | 2.8297bpw |
| IQ2_S | 3.8% 16bpw | - | 96.2% 2.3125bpw | 2.8298bpw |
| IQ2_M | 3.4% 16bpw | - | 96.6% 2.5625bpw | 3.0224bpw |
| Q2_K_S | 3.3% 16bpw | - | 96.7% 2.625bpw | 3.0723bpw |
| IQ3_XXS | 2.9% 16bpw | - | 97.1% 3.0625bpw | 3.4351bpw |
| IQ3_XS | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |
| IQ3_S | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |
| IQ3_M | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |
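As a sanity check on the table above, the average bits-per-weight is simply the size-weighted mean of the per-group bit widths. A minimal sketch (the percentages in the table are rounded, so results only match approximately):

```python
def average_bpw(high_frac: float, high_bpw: float, low_frac: float, low_bpw: float) -> float:
    """Size-weighted average bits-per-weight across tensor groups."""
    return high_frac * high_bpw + low_frac * low_bpw

# IQ1_S row: ~5.5% of weights kept at 16 bpw, the remaining ~94.5% at 1.5625 bpw.
print(round(average_bpw(0.055, 16.0, 0.945, 1.5625), 4))  # close to the table's 2.3556bpw
```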