KittenTTSDemo / README.md

Vishwas1

	---
	title: KittenTTS - High Quality Text-to-Speech
	emoji: 🎤
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 5.41.1
	app_file: app_minimal.py
	pinned: false
	license: mit
	---

	# 🎤 KittenTTS - High Quality Text-to-Speech

	A Hugging Face Space showcasing the KittenTTS model for high-quality text-to-speech generation.

	## 🚀 Features

	- 8 Different Voices: 4 male and 4 female voices to choose from
	- High Quality Audio: 24kHz sample rate for crisp, clear speech
	- GPU-Free: Works without requiring a GPU
	- Easy-to-Use Interface: Simple and intuitive Gradio web interface
	- Real-time Generation: Fast speech synthesis with progress tracking

	## 🎵 Available Voices

	| Voice ID | Gender | Description |
	|----------|--------|-------------|
	| `expr-voice-2-m` | Male | Male voice variant 2 |
	| `expr-voice-2-f` | Female | Female voice variant 2 |
	| `expr-voice-3-m` | Male | Male voice variant 3 |
	| `expr-voice-3-f` | Female | Female voice variant 3 |
	| `expr-voice-4-m` | Male | Male voice variant 4 |
	| `expr-voice-4-f` | Female | Female voice variant 4 |
	| `expr-voice-5-m` | Male | Male voice variant 5 |
	| `expr-voice-5-f` | Female | Female voice variant 5 |
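
The voice IDs in the table follow a regular `expr-voice-<variant>-<m|f>` pattern, so they can be generated and validated in code. The snippet below is a minimal sketch; `VOICE_IDS`, `GENDERS`, and `describe_voice` are hypothetical names used for illustration and are not part of KittenTTS.

```python
# Voice IDs as listed in the table above: variants 2-5, each with "m" and "f".
VOICE_IDS = [f"expr-voice-{n}-{g}" for n in range(2, 6) for g in ("m", "f")]

GENDERS = {"m": "Male", "f": "Female"}


def describe_voice(voice_id: str) -> str:
    """Return the table's description for a voice ID, or raise on an unknown ID."""
    if voice_id not in VOICE_IDS:
        raise ValueError(f"Unknown voice ID: {voice_id!r}")
    # "expr-voice-3-f" splits into ["expr", "voice", "3", "f"].
    _, _, variant, gender = voice_id.split("-")
    return f"{GENDERS[gender]} voice variant {variant}"
```

Validating the ID before calling the model keeps a typo in the dropdown value from surfacing as an opaque error deeper in the synthesis code.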

	## 🛠️ Usage

	1. Enter Text: Type or paste your text in the input box
	2. Select Voice: Choose from the dropdown menu of available voices
	3. Generate: Click the "Generate Speech" button or press Enter
	4. Play or Download: Listen to the generated audio in the browser or download it as a file
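
The steps above map onto a single handler function behind the Gradio interface. The sketch below is not the code in `app_minimal.py`. It assumes that `kittentts` exposes a `KittenTTS` class whose `generate(text, voice=...)` method returns audio samples, and that the Gradio audio output accepts a `(sample_rate, samples)` tuple. `make_handler` and `build_demo` are hypothetical names.

```python
from typing import Callable, Sequence, Tuple

SAMPLE_RATE = 24_000  # 24 kHz, as stated in this README


def make_handler(
    synthesize: Callable[[str, str], Sequence[float]],
    voices: Sequence[str],
) -> Callable[[str, str], Tuple[int, Sequence[float]]]:
    """Wrap a synthesizer in the input checks the UI needs."""

    def handler(text: str, voice: str) -> Tuple[int, Sequence[float]]:
        text = text.strip()
        if not text:
            raise ValueError("Please enter some text.")
        if voice not in voices:
            raise ValueError(f"Unknown voice: {voice!r}")
        return SAMPLE_RATE, synthesize(text, voice)

    return handler


def build_demo():
    """Wire the handler to Gradio and KittenTTS (assumes both are installed)."""
    import gradio as gr
    from kittentts import KittenTTS  # assumed import path and class name

    voices = [f"expr-voice-{n}-{g}" for n in range(2, 6) for g in ("m", "f")]
    model = KittenTTS("KittenML/kitten-tts-nano-0.1")
    handler = make_handler(lambda text, voice: model.generate(text, voice=voice), voices)
    return gr.Interface(
        fn=handler,
        inputs=[
            gr.Textbox(label="Text"),
            gr.Dropdown(choices=voices, value=voices[0], label="Voice"),
        ],
        outputs=gr.Audio(label="Generated speech"),
    )
```

Passing the synthesizer in as a callable keeps the validation logic testable without downloading the model. Calling `build_demo().launch()` would start the interface locally.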

	## 💻 Technical Details

	- Model: [KittenML/kitten-tts-nano-0.1](https://huggingface.co/KittenML/kitten-tts-nano-0.1)
	- Sample Rate: 24kHz
	- Framework: KittenTTS
	- Interface: Gradio
	- Audio Format: WAV (24kHz, mono)
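
To make the audio format concrete, the following sketch writes a 24 kHz mono WAV file using only the Python standard library. The 16-bit PCM sample width is an assumption, since this README does not state the bit depth. `write_wav` and `sine` are hypothetical helpers, and the sine tone merely stands in for model output.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # 24 kHz, as listed above


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write float samples in [-1.0, 1.0] as a 16-bit PCM mono WAV file."""
    pcm = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        pcm += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample (assumed 16-bit PCM)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(pcm))


def sine(freq_hz, seconds, sample_rate=SAMPLE_RATE):
    """Generate a half-amplitude sine tone as a list of floats."""
    n = int(seconds * sample_rate)
    return [0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
```

For example, `write_wav("tone.wav", sine(440, 0.5))` produces half a second of audio. In the app itself, the `soundfile` dependency is the more likely choice for writing files.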

	## 🔧 Local Development

	To run this locally:

	```bash
	# Clone the repository
	git clone <your-repo-url>
	cd <your-repo-name>

	# Install dependencies
	pip install -r requirements.txt

	# Run the application (the Space's front matter sets app_file: app_minimal.py)
	python app_minimal.py
	```

	## 📦 Dependencies

	- `gradio>=4.0.0` - Web interface
	- `kittentts` - TTS framework
	- `soundfile` - Audio file handling
	- `numpy` - Numerical operations
	- `torch` - PyTorch backend
	- `torchaudio` - Audio processing
	- `transformers` - Hugging Face transformers
	- `accelerate` - Model acceleration
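
Before launching the app locally, it can help to check that these packages are importable. The sketch below assumes each package's import name matches the name listed above, which is not guaranteed for every distribution. `REQUIRED_MODULES` and `missing_modules` are hypothetical names.

```python
from importlib.util import find_spec

# Import names assumed to match the package names in the list above.
REQUIRED_MODULES = [
    "gradio",
    "kittentts",
    "soundfile",
    "numpy",
    "torch",
    "torchaudio",
    "transformers",
    "accelerate",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]
```

An empty result from `missing_modules()` means every listed module was found. It does not verify versions, so the `gradio>=4.0.0` constraint still has to be checked by `pip`.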

	## 🤝 Contributing

	Feel free to contribute by:
	- Reporting bugs
	- Suggesting new features
	- Improving the UI
	- Adding more voice options

	## 📄 License

	The front matter of this Space declares the MIT license for the demo itself. The demo uses the KittenTTS model, which has its own license; refer to the original model's license for usage terms.

	## 🙏 Acknowledgments

	- [KittenML](https://huggingface.co/KittenML) for the TTS model
	- [Hugging Face](https://huggingface.co) for the Spaces platform
	- [Gradio](https://gradio.app) for the web interface framework

	---

	Note: This is a demonstration of the KittenTTS model. For production use, please ensure compliance with the model's license and terms of use.