Upload complete model
README.md
CHANGED
@@ -10,7 +10,7 @@ library_name: mlx
 ### CURRENTLY UPLOADING
 ### CURRENTLY UPLOADING

-**See Kimi-K2-Thinking 4.25bit MLX in action - [demonstration video
+**See Kimi-K2-Thinking 4.25bit MLX in action - [demonstration video](https://youtu.be/GydlPnP7IYk)**

 *q4.25bit quant perplexity TBA, but q4.5bit quant typically achieves 1.168 perplexity in our testing*
 | Quantization | Perplexity |
@@ -30,7 +30,7 @@ library_name: mlx
 * Memory usage: MBP ~80GB + Mac Studio ~450GB
 * Expect ~22 tokens/s @ 1000 tokens
 * Quantized with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.29
-* For more details see [demonstration video
+* For more details see [demonstration video](https://youtu.be/GydlPnP7IYk) or visit [Kimi-K2-Thinking](https://huggingface.co/moonshotai/Kimi-K2-Thinking).

 ## Disclaimer

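
As a rough sanity check on the memory figures in the second hunk (MBP ~80GB + Mac Studio ~450GB), the sketch below estimates weight storage from an average bit width. The parameter count is an assumption (about 1 trillion total parameters) and is not stated in the README; `quantized_weight_gb` is a hypothetical helper, and the estimate ignores KV cache, activations, and runtime overhead.

```python
def quantized_weight_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: ~1T total parameters; this figure is not given in the README.
ASSUMED_PARAMS = 1.0e12

# 4.25 is the average bits per weight named in the README ("4.25bit").
estimate_gb = quantized_weight_gb(ASSUMED_PARAMS, 4.25)

# Combined memory quoted in the README: MBP ~80GB + Mac Studio ~450GB.
readme_total_gb = 80 + 450

print(f"estimated weights: ~{estimate_gb:.0f} GB; README total: ~{readme_total_gb} GB")
```

Under that assumption the two numbers land within a few gigabytes of each other, which suggests the quoted memory usage is dominated by the quantized weights.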