Eviation committed (verified)
Commit 948fb1f · 1 parent: 029f98e

Update README.md

Files changed (1): README.md (+18 −3)

README.md CHANGED
@@ -27,7 +27,7 @@ Simple imatrix: 512x512 single image 8/20 steps [city96/flux1-dev-Q3_K_S](https:
 
 data: `load_imatrix: loaded 314 importance matrix entries from imatrix.dat computed on 7 chunks`.
 
-Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6).
+Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch).
 
 ## Experimental from q8
 
@@ -41,12 +41,27 @@ Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit
 | [flux1-dev-IQ2_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_XS.gguf) | IQ2_XS | 3.56GB | TBC |
 | [flux1-dev-IQ2_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_S.gguf) | IQ2_S | 3.56GB | TBC |
 | [flux1-dev-IQ2_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ2_M.gguf) | IQ2_M | 3.93GB | TBC |
-| [flux1-dev-IQ3_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_M.gguf) | IQ3_M | TBC | TBC |
 | [flux1-dev-Q2_K.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q2_K.gguf) | Q2_K | 4.02GB | TBC |
 | [flux1-dev-Q2_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q2_K_S.gguf) | Q2_K_S | 4.02GB | TBC |
 | [flux1-dev-IQ3_XXS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_XXS.gguf) | IQ3_XXS | 4.66GB | TBC |
 | [flux1-dev-IQ3_XS.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_XS.gguf) | IQ3_XS | 5.22GB | TBC |
 | [flux1-dev-IQ3_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_S.gguf) | IQ3_S | 5.22GB | TBC |
-| - | Q3_K | TBC | TBC |
+| [flux1-dev-IQ3_M.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-IQ3_M.gguf) | IQ3_M | 5.22GB | TBC |
 | [flux1-dev-Q3_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_S.gguf) | Q3_K_S | 5.22GB | TBC |
+| - | Q3_K_M | TBC | TBC |
+| - | Q3_K_L | TBC | TBC |
+| - | IQ4_XS | TBC | TBC |
+| - | IQ4_NL | TBC | TBC |
+| - | Q4_0 | TBC | TBC |
+| - | Q4_K | TBC | TBC |
+| - | Q4_K_S | TBC | TBC |
+| - | Q4_K_M | TBC | TBC |
+| - | Q5_K | TBC | TBC |
+| [flux1-dev-Q3_K_S.gguf](https://huggingface.co/Eviation/flux-imatrix/blob/main/experimental-from-q8/flux1-dev-Q3_K_S.gguf) | Q3_K_S | 5.22GB | TBC |
+
+## Observations
+
+Sub-k-quants not differentiated as expected: IQ2_XS == IQ2_S, IQ3_XS == IQ3_S == IQ3_M.
+- Check if [lcpp_sd3.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp_sd3.patch) includes more specific quant logic
+- Extrapolate the existing logic
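As a sanity check on the size column, a GGUF file is roughly parameter count times bits per weight. A minimal sketch (assumptions: ~11.9B parameters for the FLUX.1-dev transformer, and nominal bits-per-weight figures for each quant type — both approximations, so estimates land a few percent below the listed sizes, which also carry some tensors at higher precision):

```python
# Rough GGUF size estimate: parameters * bits-per-weight / 8 bytes.
# PARAMS and the BPW table are assumed approximate values, not read
# from the actual files.
PARAMS = 11.9e9  # FLUX.1-dev transformer, approximate

# Nominal bits per weight for a few llama.cpp quant types (approximate).
BPW = {
    "IQ2_XS": 2.31,
    "Q2_K": 2.63,
    "IQ3_XXS": 3.06,
    "Q3_K_S": 3.44,
}

def estimate_gb(quant: str) -> float:
    """Estimated file size in decimal GB (1 GB = 1e9 bytes)."""
    return PARAMS * BPW[quant] / 8 / 1e9

if __name__ == "__main__":
    for q in BPW:
        print(f"{q}: ~{estimate_gb(q):.2f}GB")
```

For example, Q3_K_S comes out near 5.1GB against the 5.22GB listed above, consistent with a small overhead for non-quantized tensors.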
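The IQ2_XS == IQ2_S (and IQ3_XS == IQ3_S == IQ3_M) observation can be confirmed by comparing checksums of the output files: byte-identical files would mean the patched quantize path never branched on the sub-type. A minimal sketch (file paths are illustrative):

```python
import hashlib
from pathlib import Path

def sha256sum(path: Path, chunk: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so multi-GB GGUFs don't fill RAM."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def same_bytes(a: Path, b: Path) -> bool:
    # Cheap size comparison first, then a full content hash.
    return a.stat().st_size == b.stat().st_size and sha256sum(a) == sha256sum(b)
```

Usage would be e.g. `same_bytes(Path("flux1-dev-IQ3_XS.gguf"), Path("flux1-dev-IQ3_S.gguf"))`; matching sizes alone (5.22GB for all three IQ3 variants in the table) are suggestive but not conclusive.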