Update README.md
README.md CHANGED

@@ -66,6 +66,8 @@ This model can be loaded with just over 10GB of VRAM (compared to the original 1
 
 The 8 bit GPTQ quant has minimal quality degradation from the original `bfloat16` model due to its higher bitrate.
 
+The `untrained-special-tokens-fixed` branch contains the same model as the main branch, but its untrained special tokens and other untrained tokens (identified by finding the tokens whose maximum embedding value in both input_embeddings and output_embeddings is 0) are set to the per-feature average of all trained tokens. Using this branch is recommended if you plan to do any fine-tuning, whether you add your own tokens or train for instruction following.
+
 <!-- description end -->
 
 ## GPTQ Quantization Method
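The untrained-token fix described in the added paragraph can be sketched as follows. This is a minimal pure-Python illustration of the idea (rows stand in for embedding-matrix rows); the helper name `fix_untrained_tokens` and the toy matrices are hypothetical, not the repository's actual script:

```python
def fix_untrained_tokens(input_emb, output_emb):
    """Illustrative sketch: a token counts as untrained when the max
    value of its embedding row is 0 in BOTH the input and output
    embedding matrices; such rows are replaced by the per-feature
    average of the trained rows (hypothetical helper, not the
    model author's actual code)."""
    n = len(input_emb)
    # Flag tokens whose row max is 0 in both matrices.
    untrained = [
        max(input_emb[i]) == 0 and max(output_emb[i]) == 0
        for i in range(n)
    ]
    for emb in (input_emb, output_emb):
        trained_rows = [emb[i] for i in range(n) if not untrained[i]]
        # Per-feature mean over the trained tokens only.
        mean = [sum(col) / len(trained_rows) for col in zip(*trained_rows)]
        for i in range(n):
            if untrained[i]:
                emb[i] = list(mean)
    return untrained


# Toy example: token 1 is all-zero (untrained) in both matrices.
inp = [[1.0, -1.0], [0.0, 0.0], [2.0, 2.0]]
out = [[0.5, 0.5], [0.0, 0.0], [1.0, 1.0]]
mask = fix_untrained_tokens(inp, out)
```

In a real checkpoint the same mask-and-average step would be applied to the model's `input_embeddings` and `output_embeddings` weight matrices before saving the fixed branch.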