Upload README.md
README.md
CHANGED
|
@@ -26,7 +26,7 @@ model-index:
|
|
| 26 |
**The first 8-bit VibeVoice model that actually works**
|
| 27 |
|
| 28 |
[](LICENSE)
|
| 29 |
-
[](https://huggingface.co/FabioSarracino/VibeVoice-Large-Q8)
|
| 31 |
|
| 32 |
[🤗 Model](https://huggingface.co/FabioSarracino/VibeVoice-Large-Q8) • [💻 ComfyUI](https://github.com/Enemyx-net/VibeVoice-ComfyUI) • [📖 Docs](https://github.com/Enemyx-net/VibeVoice-ComfyUI/blob/main/README.md)
|
|
@@ -43,7 +43,7 @@ The secret? **Selective quantization**: I only quantized the language model (the
|
|
| 43 |
|
| 44 |
### Results
|
| 45 |
- ✅ Perfect audio, identical to the original model
|
| 46 |
-
- ✅
|
| 47 |
- ✅ Uses ~12 GB VRAM instead of 20 GB
|
| 48 |
- ✅ Works on 12 GB GPUs (RTX 3060, 4070 Ti, etc.)
|
| 49 |
|
|
@@ -68,11 +68,11 @@ I only quantized what can be safely quantized without losing quality.
|
|
| 68 |
|
| 69 |
| Model | Size | Audio Quality | Status |
|
| 70 |
|-------|------|---------------|--------|
|
| 71 |
-
| Original VibeVoice |
|
| 72 |
-
| Other 8-bit models |
|
| 73 |
-
| **This model** | **
|
| 74 |
|
| 75 |
-
+0
|
| 76 |
|
| 77 |
---
|
| 78 |
|
|
@@ -157,11 +157,11 @@ wavfile.write("output.wav", 24000, audio)
|
|
| 157 |
- You need a production-ready model
|
| 158 |
- You want the best size/quality balance
|
| 159 |
|
| 160 |
-
### Use full precision (
|
| 161 |
- You have unlimited VRAM (24+ GB)
|
| 162 |
- You're doing research requiring absolute precision
|
| 163 |
|
| 164 |
-
### Use 4-bit NF4 (~6 GB) if:
|
| 165 |
- You only have 8-10 GB VRAM
|
| 166 |
- You can accept a small quality trade-off
|
| 167 |
|
|
|
|
| 26 |
**The first 8-bit VibeVoice model that actually works**
|
| 27 |
|
| 28 |
[](LICENSE)
|
| 29 |
+
[](https://huggingface.co/FabioSarracino/VibeVoice-Large-Q8)
|
| 30 |
[](https://huggingface.co/FabioSarracino/VibeVoice-Large-Q8)
|
| 31 |
|
| 32 |
[🤗 Model](https://huggingface.co/FabioSarracino/VibeVoice-Large-Q8) • [💻 ComfyUI](https://github.com/Enemyx-net/VibeVoice-ComfyUI) • [📖 Docs](https://github.com/Enemyx-net/VibeVoice-ComfyUI/blob/main/README.md)
|
|
|
|
| 43 |
|
| 44 |
### Results
|
| 45 |
- ✅ Perfect audio, identical to the original model
|
| 46 |
+
- ✅ 11.6 GB instead of 18.7 GB (-38%)
|
| 47 |
- ✅ Uses ~12 GB VRAM instead of 20 GB
|
| 48 |
- ✅ Works on 12 GB GPUs (RTX 3060, 4070 Ti, etc.)
|
| 49 |
|
|
|
|
| 68 |
|
| 69 |
| Model | Size | Audio Quality | Status |
|
| 70 |
|-------|------|---------------|--------|
|
| 71 |
+
| Original VibeVoice | 18.7 GB | ⭐⭐⭐⭐⭐ | Full precision |
|
| 72 |
+
| Other 8-bit models | 10.6 GB | 💥 NOISE | ❌ Don't work |
|
| 73 |
+
| **This model** | **11.6 GB** | ⭐⭐⭐⭐⭐ | ✅ **Perfect** |
|
| 74 |
|
| 75 |
+
+1.0 GB vs other 8-bit models = perfect audio instead of noise. Worth it.
|
| 76 |
|
| 77 |
---
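
The size figures quoted in this hunk can be cross-checked with a few lines of arithmetic. This is only a sketch: the constant and function names are illustrative, and the sizes are taken directly from the table in the README.

```python
# Sizes in GB exactly as quoted in the README comparison table.
ORIGINAL_GB = 18.7
OTHER_8BIT_GB = 10.6
THIS_MODEL_GB = 11.6


def percent_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent (negative means smaller)."""
    return (new - old) / old * 100.0


# Size reduction of this model relative to the full-precision original.
reduction = percent_change(THIS_MODEL_GB, ORIGINAL_GB)

# Extra size of this model relative to the other 8-bit models.
overhead_gb = THIS_MODEL_GB - OTHER_8BIT_GB

print(f"vs original: {reduction:.0f}%")
print(f"vs other 8-bit: +{overhead_gb:.1f} GB")
```

Both results agree with the "-38%" and "+1.0 GB" figures stated in the README.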
|
| 78 |
|
|
|
|
| 157 |
- You need a production-ready model
|
| 158 |
- You want the best size/quality balance
|
| 159 |
|
| 160 |
+
### Use full precision (18.7 GB) if:
|
| 161 |
- You have unlimited VRAM (24+ GB)
|
| 162 |
- You're doing research requiring absolute precision
|
| 163 |
|
| 164 |
+
### Use 4-bit NF4 (~6.6 GB) if:
|
| 165 |
- You only have 8-10 GB VRAM
|
| 166 |
- You can accept a small quality trade-off
|
| 167 |
|
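
The "which variant to use" guidance above could be encoded as a small helper. This is a hypothetical sketch, not part of the model's tooling: `suggest_variant` is an invented name, and the cut-offs between the ranges the README names (for example, what to do with 10 to 12 GB, or with less than 8 GB) are assumptions.

```python
def suggest_variant(vram_gb: float) -> str:
    """Map available VRAM to one of the variants listed in the README.

    The README names 24+ GB for full precision, 12 GB GPUs for the 8-bit
    model, and 8-10 GB for 4-bit NF4. The boundaries in between are
    assumptions made for this sketch.
    """
    if vram_gb >= 24:
        return "full precision (18.7 GB)"
    if vram_gb >= 12:
        return "8-bit (11.6 GB)"
    if vram_gb >= 8:
        # Assumption: cards between 10 and 12 GB also fall back to NF4.
        return "4-bit NF4 (~6.6 GB)"
    # The README gives no recommendation below 8 GB.
    return "no listed variant"


for vram in (24, 12, 8, 6):
    print(vram, "GB ->", suggest_variant(vram))
```

The function only restates the README's thresholds; actual memory use will also depend on factors such as generation length, which the README does not quantify here.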