Simple imatrix: 512x512 single image 8/20 steps [city96/flux1-dev-Q3_K_S](https:
data: `load_imatrix: loaded 314 importance matrix entries from imatrix.dat computed on 7 chunks`.

Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch).
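The build-and-quantize step can be sketched as shell commands. This is a sketch under assumptions, not the exact procedure used here: the `llama-quantize` target name, the patched-build steps, and the `Q3_K_S` example target are illustrative; `imatrix.dat` and the q8 source file are the artifacts named above.

```shell
# Sketch: build a patched llama.cpp quantize tool and requantize
# flux1-dev from the q8 GGUF using the importance matrix.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout cae9fb4
# apply city96's patch so quantize handles the image-model tensors
curl -L -o lcpp.patch https://github.com/city96/ComfyUI-GGUF/raw/main/tools/lcpp.patch
git apply lcpp.patch
make llama-quantize
# requantize with the imatrix computed above (example target: Q3_K_S)
./llama-quantize --imatrix imatrix.dat flux1-dev-Q8_0.gguf flux1-dev-Q3_K_S.gguf Q3_K_S
```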
## Experimental from q8
| [flux1-dev-IQ2_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_XS.gguf) | IQ2_XS | 3.56GB | TBC |
| [flux1-dev-IQ2_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_S.gguf) | IQ2_S | 3.56GB | TBC |
| [flux1-dev-IQ2_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_M.gguf) | IQ2_M | 3.93GB | TBC |
| [flux1-dev-Q2_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q2_K.gguf) | Q2_K | 4.02GB | TBC |
| [flux1-dev-Q2_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q2_K_S.gguf) | Q2_K_S | 4.02GB | TBC |
| [flux1-dev-IQ3_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_XXS.gguf) | IQ3_XXS | 4.66GB | TBC |
| [flux1-dev-IQ3_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_XS.gguf) | IQ3_XS | 5.22GB | TBC |
| [flux1-dev-IQ3_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_S.gguf) | IQ3_S | 5.22GB | TBC |
| [flux1-dev-IQ3_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_M.gguf) | IQ3_M | 5.22GB | TBC |
| [flux1-dev-Q3_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_S.gguf) | Q3_K_S | 5.22GB | TBC |
| - | Q3_K_M | TBC | TBC |
| - | Q3_K_L | TBC | TBC |
| - | IQ4_XS | TBC | TBC |
| - | IQ4_NL | TBC | TBC |
| - | Q4_0 | TBC | TBC |
| - | Q4_K | TBC | TBC |
| - | Q4_K_S | TBC | TBC |
| - | Q4_K_M | TBC | TBC |
| - | Q5_K | TBC | TBC |
## Observations

Sub-k-quants are not differentiated as expected: IQ2_XS == IQ2_S, and IQ3_XS == IQ3_S == IQ3_M.

- Check whether [lcpp_sd3.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp_sd3.patch) includes more specific per-quant logic
- Extrapolate the existing logic
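The observation above can be put in numbers: when two quant variants produce byte-identical file sizes, their effective bits per weight are identical, so the sub-quant tiers have collapsed together. A minimal sketch, assuming a parameter count of roughly 11.9B for the flux1-dev transformer (an assumption, used only for scale):

```python
# Rough bits-per-weight from the file sizes in the table above.
# n_params ~= 11.9e9 is an ASSUMED parameter count for flux1-dev.
sizes_gb = {
    "IQ2_XS": 3.56, "IQ2_S": 3.56,                  # identical sizes
    "IQ3_XS": 5.22, "IQ3_S": 5.22, "IQ3_M": 5.22,   # identical sizes
}
n_params = 11.9e9
bpw = {q: gb * 1e9 * 8 / n_params for q, gb in sizes_gb.items()}

# Identical sizes imply the sub-quants were not differentiated:
assert bpw["IQ2_XS"] == bpw["IQ2_S"]
assert bpw["IQ3_XS"] == bpw["IQ3_S"] == bpw["IQ3_M"]
```

If the patch were mapping each sub-quant to its own tensor-type mix, these sizes (and bpw values) would differ, which is what the lcpp_sd3.patch check above is meant to confirm.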