Update README.md
README.md
CHANGED
@@ -11,6 +11,7 @@ models:
 - Pendrokar/xvapitch_expresso
 - Pendrokar/TorchMoji
 - Pendrokar/xvasynth_lojban
+- Pendrokar/xvasynth_cabal
 app_file: app.py
 app_port: 7860
 tags:
@@ -30,4 +31,23 @@ thumbnail: https://huggingface.co/spaces/Pendrokar/xVASynth/raw/main/thumbnail.p
 short_description: CPU powered, low RTF, emotional, multilingual TTS
 ---
 
-DanRuta's xVASynth, GitHub repo: [https://github.com/DanRuta/xVA-Synth](https://github.com/DanRuta/xVA-Synth)
+DanRuta's xVASynth, GitHub repo: [https://github.com/DanRuta/xVA-Synth](https://github.com/DanRuta/xVA-Synth)
+
+Papers:
+- VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech - https://arxiv.org/abs/2106.06103
+- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone - https://arxiv.org/abs/2112.02418
+
+Referenced papers within code:
+- Multi-head attention with Relative Positional embedding - https://arxiv.org/pdf/1809.04281.pdf
+- Transformer with Relative Positional Encoding - https://arxiv.org/abs/1803.02155
+- SDP - https://arxiv.org/pdf/2106.06103.pdf
+- Spline Flow - https://arxiv.org/abs/1906.04032
+
+Extra:
+- DeepMoji - https://arxiv.org/abs/1708.00524
+
+xVA FastPitch:
+- [1] [FastPitch: Parallel Text-to-speech with Pitch Prediction](https://arxiv.org/abs/2006.06873)
+- [2] [One TTS Alignment To Rule Them All](https://arxiv.org/abs/2108.10447)
+
+Used datasets: Unknown/Non-permissible data
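The first hunk adds an entry to the `models:` list in the README's front matter. As a rough sketch of how that list could be read back out of the file, the standard-library Python below assumes the README opens with a `---` line and that list items use the `- ` form seen in the diff. The names `front_matter_models` and `SAMPLE` are hypothetical and are not part of the Space's code; a real YAML parser would be the more robust choice.

```python
def front_matter_models(readme_text):
    """Return the entries listed under `models:` in a README's front matter.

    Assumption: the front matter is delimited by `---` lines and the list
    uses one `- item` per line, as in the diff above. This is not a full
    YAML parser.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no front matter found
    models = []
    in_models = False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter of the front matter
            break
        if stripped == "models:":
            in_models = True
            continue
        if in_models:
            if stripped.startswith("- "):
                models.append(stripped[2:].strip())
            elif stripped:  # any other key ends the list
                in_models = False
    return models


# Hypothetical excerpt built from the lines visible in the diff.
SAMPLE = """---
models:
- Pendrokar/xvapitch_expresso
- Pendrokar/TorchMoji
- Pendrokar/xvasynth_lojban
- Pendrokar/xvasynth_cabal
app_file: app.py
app_port: 7860
---
"""

print(front_matter_models(SAMPLE))
```

Run against the excerpt, the function returns the three existing model entries followed by the newly added `Pendrokar/xvasynth_cabal`.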