---
license: mit
language:
- en
- fr
- it
- de
- es
library_name: transformers
tags:
- mixtral
- text-generation-inference
---
Attention quantization: HQQ 4-bit, group size 64, compressed zero-point, compressed scale with group size 256.

Experts quantization: HQQ 3-bit, group size 64, compressed zero-point, compressed scale with group size 128.
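The two per-layer settings above can be sketched as plain-dict configs. This is a hypothetical illustration only: the field names (`nbits`, `group_size`, `quant_zero`, `quant_scale`, `scale_group_size`) are assumptions chosen to mirror the parameters listed in the card, not a specific library API.

```python
# Sketch of the two HQQ quantization settings described above.
# Plain dicts are used for illustration; the field names are assumptions,
# not a fixed API of the hqq library.

def hqq_config(nbits, group_size, scale_group_size):
    """Build a config with zero-point and scale compression enabled."""
    return {
        "nbits": nbits,                        # weight bit-width
        "group_size": group_size,              # weights per quantization group
        "quant_zero": True,                    # "compressed zero-point"
        "quant_scale": True,                   # "compressed scale"
        "scale_group_size": scale_group_size,  # group size used for the scales
    }

# Attention layers: 4-bit, group size 64, scales quantized with group size 256
attn_cfg = hqq_config(nbits=4, group_size=64, scale_group_size=256)

# Expert (MoE) layers: 3-bit, group size 64, scales quantized with group size 128
experts_cfg = hqq_config(nbits=3, group_size=64, scale_group_size=128)

print(attn_cfg)
print(experts_cfg)
```

Quantizing the experts more aggressively (3-bit) than attention (4-bit) is a common trade-off for Mixtral-style MoE models, since the expert weights dominate the parameter count.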