|
|
--- |
|
|
base_model: |
|
|
- black-forest-labs/FLUX.1-dev |
|
|
pipeline_tag: text-to-image |
|
|
tags: |
|
|
- gguf |
|
|
- flux |
|
|
- text-to-image |
|
|
- imatrix |
|
|
--- |
|
|
|
|
|
# Supported? |
|
|
|
|
|
Expect broken or faulty items for the time being. Use at your own discretion. |
|
|
|
|
|
- ComfyUI-GGUF: all? (CPU/CUDA) |
|
|
- Fast dequant: BF16, Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, Q6_K, Q5_K, Q4_K, Q3_K, Q2_K |
|
|
- Slow dequant: others [via GGUF/NumPy](https://github.com/city96/ComfyUI-GGUF/blob/379175e7bf8b65019cdd11108bb882120a6f17df/dequant.py#L24-L28) |
|
|
- Forge: TBC |
|
|
- stable-diffusion.cpp: [llama.cpp Feature-matrix](https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix) |
|
|
- CPU: all |
|
|
- Cuda: all? |
|
|
- Vulkan: >= Q3_K_S, > IQ4_S; [PR IQ1_S, IQ1_M](https://github.com/ggerganov/llama.cpp/pull/11528) [PR IQ4_XS](https://github.com/ggerganov/llama.cpp/pull/11501) |
|
|
- other: ? |
|
|
|
|
|
# Bravo |
|
|
Combined imatrix multiple images 512x512 25 and 50 steps [city96/flux1-dev-Q8_0](https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q8_0.gguf) euler |
|
|
|
|
|
Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch). |
|
|
|
|
|
## Experimental from f16 |
|
|
|
|
|
| Filename | Quant type | File Size | Description | Example Image | |
|
|
| -------- | ---------- | --------- | ----------- | ------------- | |
|
|
| [flux1-dev-IQ1_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ1_S.gguf) | IQ1_S | 2.45GB | bad quality | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ1_S_512_25_woman.png) | |
|
|
| [flux1-dev-IQ1_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ1_M.gguf) | IQ1_M | 2.72GB | bad quality | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ1_M_512_25_woman.png) | |
|
|
| [flux1-dev-IQ2_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_XXS.gguf) | IQ2_XXS | 3.19GB | bad quality | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ2_XXS_512_25_woman.png) | |
|
|
| [flux1-dev-IQ2_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_XS.gguf) | IQ2_XS | 3.56GB | TBC | - | |
|
|
| [flux1-dev-IQ2_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_S.gguf) | IQ2_S | 3.56GB | TBC | - | |
|
|
| [flux1-dev-IQ2_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ2_M.gguf) | IQ2_M | 3.93GB | bad quality | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ2_M_512_25_woman.png) | |
|
|
| [flux1-dev-Q2_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q2_K_S.gguf) | Q2_K_S | 4.02GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_Q2_K_S_512_25_woman.png) | |
|
|
| [flux1-dev-IQ3_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ3_XXS.gguf) | IQ3_XXS | 4.66GB | - | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ3_XXS_512_25_woman.png) | |
|
|
| [flux1-dev-IQ3_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ3_XS.gguf) | IQ3_XS | 5.22GB | worse than IQ3_XXS | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ3_XS_512_25_woman.png) | |
|
|
| [flux1-dev-Q3_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q3_K_S.gguf) | Q3_K_S | 5.22GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_Q3_K_S_512_25_woman.png) | |
|
|
| [flux1-dev-IQ4_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ4_XS.gguf) | IQ4_XS | 6.42GB | TBC | - | |
|
|
| [flux1-dev-Q4_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q4_0.gguf) | Q4_0 | 6.79GB | TBC | - | |
|
|
| [flux1-dev-IQ4_NL.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-IQ4_NL.gguf) | IQ4_NL | 6.79GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_IQ4_NL_512_25_woman.png) | |
|
|
| [flux1-dev-Q4_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q4_K_S.gguf) | Q4_K_S | 6.79GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_Q4_K_S_512_25_woman.png) | |
|
|
| [flux1-dev-Q4_1.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q4_1.gguf) | Q4_1 | 7.53GB | TBC | - | |
|
|
| [flux1-dev-Q5_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/flux1-dev-Q5_K_S.gguf) | Q5_K_S | 8.27GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-f16-combined/images/output_test_comb_Q5_K_S_512_25_woman.png) | |
|
|
|
|
|
## Observations |
|
|
|
|
|
- Bravo IQ1_S worse than Alpha? |
|
|
- [Latent loss](https://huggingface.co/Eviation/flux-imatrix/blob/main/images/latent_loss.png) |
|
|
- [Per layer quantization cost](https://huggingface.co/Eviation/flux-imatrix/blob/main/images/casting_cost.png) from [chrisgoringe/casting_cost](https://github.com/chrisgoringe/mixed-gguf-converter/blob/main/costs/casting_cost.yaml) |
|
|
- [Ablation latent loss per weight type](https://huggingface.co/Eviation/flux-imatrix/blob/main/images/latent_loss_ablation.png) |
|
|
|
|
|
# Alpha |
|
|
Simple imatrix: 512x512 single image 8/20 steps [city96/flux1-dev-Q3_K_S](https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q3_K_S.gguf) euler |
|
|
|
|
|
data: `load_imatrix: loaded 314 importance matrix entries from imatrix.dat computed on 7 chunks`. |
|
|
|
|
|
Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch). |
|
|
|
|
|
## Experimental from q8 |
|
|
|
|
|
| Filename | Quant type | File Size | Description | Example Image | |
|
|
| -------- | ---------- | --------- | ----------- | ------------- | |
|
|
| [flux1-dev-IQ1_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ1_S.gguf) | IQ1_S | 2.45GB | obviously bad quality | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ1_S_512_25_woman.png) | |
|
|
| - | IQ1_M | - | broken | - | |
|
|
| [flux1-dev-TQ1_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-TQ1_0.gguf) | TQ1_0| 2.63GB | TBC | - | |
|
|
| [flux1-dev-TQ2_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-TQ2_0.gguf) | TQ2_0 | 3.19GB | TBC | - | |
|
|
| [flux1-dev-IQ2_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_XXS.gguf) | IQ2_XXS | 3.19GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ2_XXS_512_25_woman.png) | |
|
|
| [flux1-dev-IQ2_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_XS.gguf) | IQ2_XS | 3.56GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ2_XS_512_25_woman.png) | |
|
|
| [flux1-dev-IQ2_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_S.gguf) | IQ2_S | 3.56GB | TBC | - | |
|
|
| [flux1-dev-IQ2_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_M.gguf) | IQ2_M | 3.93GB | TBC | - | |
|
|
| [flux1-dev-Q2_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q2_K.gguf) | Q2_K | 4.02GB | TBC | - | |
|
|
| [flux1-dev-Q2_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q2_K_S.gguf) | Q2_K_S | 4.02GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q2_K_S_512_25_woman.png) | |
|
|
| [flux1-dev-IQ3_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_XXS.gguf) | IQ3_XXS | 4.66GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ3_XXS_512_25_woman.png) | |
|
|
| [flux1-dev-IQ3_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_XS.gguf) | IQ3_XS | 5.22GB | TBC | - | |
|
|
| [flux1-dev-IQ3_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_S.gguf) | IQ3_S | 5.22GB | TBC | - | |
|
|
| [flux1-dev-IQ3_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_M.gguf) | IQ3_M | 5.22GB | TBC | - | |
|
|
| [flux1-dev-Q3_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_S.gguf) | Q3_K_S | 5.22GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q3_K_S_512_25_woman.png) | |
|
|
| [flux1-dev-Q3_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_K.gguf) | Q3_K_M | 5.36GB | TBC | - | |
|
|
| [flux1-dev-Q3_K_L.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_L.gguf) | Q3_K_L | 5.36GB | TBC | - | |
|
|
| [flux1-dev-IQ4_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ4_XS.gguf) | IQ4_XS | 6.42GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ4_XS_512_25_woman.png) | |
|
|
| [flux1-dev-IQ4_NL.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ4_NL.gguf) | IQ4_NL | 6.79GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_IQ4_NL_512_25_woman.png) | |
|
|
| [flux1-dev-Q4_0.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_0.gguf) | Q4_0 | 6.79GB | TBC | - | |
|
|
| - | Q4_K | TBC | TBC | - | |
|
|
| [flux1-dev-Q4_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_K_S.gguf) | Q4_K_S | 6.79GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q4_K_S_512_25_woman.png) | |
|
|
| [flux1-dev-Q4_K_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_K_M.gguf) | Q4_K_M | 6.93GB | TBC | - | |
|
|
| [flux1-dev-Q4_1.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q4_1.gguf) | Q4_1 | 7.53GB | TBC | - | |
|
|
| [flux1-dev-Q5_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q5_K_S.gguf) | Q5_K_S | 8.27GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q5_K_S_512_25_woman.png) | |
|
|
| [flux1-dev-Q5_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q5_K.gguf) | Q5_K | 8.41GB | TBC | - | |
|
|
| - | Q5_K_M | TBC | TBC | - | |
|
|
| [flux1-dev-Q6_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q6_K.gguf) | Q6_K | 9.84GB | TBC | - | |
|
|
| - | Q8_0 | 12.7GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_Q8_512_25_woman.png) | |
|
|
| - | F16 | 23.8GB | TBC | [Example](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/images/output_test_F16_512_25_woman.png) | |
|
|
|
|
|
## Observations |
|
|
|
|
|
Sub-quants not diferentiated as expected: IQ2_XS == IQ2_S, IQ3_XS == IQ3_S == IQ3_M, Q3_K_M == Q3_K_L. |
|
|
- Check if [lcpp_sd3.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp_sd3.patch) includes more specific quant level logic |
|
|
- Extrapolate the existing level logic |
|
|
|
|
|
| Quant type | High level quants | Middle level quants | Low level quant | Average | |
|
|
| ---------- | ----- | ----- | --------- | ---- | |
|
|
| IQ1_S | 5.5% 16bpw | - | 94.5% 1.5625bpw | 2.3556bpw | |
|
|
| IQ2_XXS | 4.2% 16bpw | - | 95.8% 2.0625bpw | 2.6504bpw | |
|
|
| IQ2_XS | 3.8% 16bpw | - | 96.2% 2.3125bpw | 2.8297bpw | |
|
|
| IQ2_S | 3.8% 16bpw | - | 96.2% 2.3125bpw | 2.8298bpw | |
|
|
| IQ2_M | 3.4% 16bpw | - | 96.6% 2.5625bpw | 3.0224bpw | |
|
|
| Q2_K_S | 3.3% 16bpw | - | 96.7% 2.625bpw | 3.0723bpw | |
|
|
| IQ3_XXS | 2.9% 16bpw | - | 97.1% 3.0625bpw | 3.4351bpw | |
|
|
| IQ3_XS | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw | |
|
|
| IQ3_S | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw | |
|
|
| IQ3_M | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw | |
|
|
|
|
|
|