Update README.md
README.md CHANGED

@@ -28,6 +28,12 @@ GGML files are for CPU + GPU inference using [llama.cpp](https://github.com/ggerganov/llama.cpp)
 * [llama-cpp-python](https://github.com/abetlen/llama-cpp-python)
 * [ctransformers](https://github.com/marella/ctransformers)
 
+## Update 9th July 2023: GGML k-quants now available
+
+Thanks to the work of LostRuins/concedo, it is now possible to provide 100% working GGML k-quants for models like this which have a non-standard vocab size (32,001).
+
+k-quants have been uploaded and will work with all llama.cpp clients without any changes required.
+
 ## Repositories available
 
 * [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/WizardLM-13B-V1.1-GPTQ)
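For reference, a minimal sketch of loading one of the new k-quant files with [llama-cpp-python](https://github.com/abetlen/llama-cpp-python). The model filename and the USER/ASSISTANT prompt template below are assumptions following TheBloke's usual naming conventions, not taken from this repo's file list:

```python
# Minimal sketch: run a GGML k-quant on CPU (optionally offloading layers
# to GPU) with llama-cpp-python. The model filename is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="wizardlm-13b-v1.1.ggmlv3.q4_K_M.bin",  # assumed k-quant filename
    n_ctx=2048,       # context window size
    n_gpu_layers=32,  # >0 only if llama-cpp-python was built with GPU support
)

output = llm(
    "USER: Explain what a k-quant is in one sentence.\nASSISTANT:",
    max_tokens=96,
    stop=["USER:"],
)
print(output["choices"][0]["text"])
```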
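And the equivalent with [ctransformers](https://github.com/marella/ctransformers), which can pull the file straight from the Hub; the GGML repo id and filename are again assumptions, inferred from the GPTQ link above:

```python
# Minimal sketch: load the GGML k-quant via ctransformers directly from the
# Hugging Face Hub. Repo id and filename are assumptions.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/WizardLM-13B-V1.1-GGML",                 # assumed repo id
    model_file="wizardlm-13b-v1.1.ggmlv3.q4_K_M.bin",  # assumed filename
    model_type="llama",
)

print(llm("USER: Explain what a k-quant is in one sentence.\nASSISTANT:", max_new_tokens=96))
```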