---
title: KittenTTS - High Quality Text-to-Speech
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.41.1
app_file: app_minimal.py
pinned: false
license: mit
---

# 🎤 KittenTTS - High Quality Text-to-Speech

A Hugging Face Space showcasing the KittenTTS model for high-quality text-to-speech generation.

## 🚀 Features

- **8 Different Voices**: 4 male and 4 female voices to choose from
- **High Quality Audio**: 24kHz sample rate for crisp, clear speech
- **GPU-Free**: Works without requiring a GPU
- **Easy-to-Use Interface**: Simple and intuitive Gradio web interface
- **Real-time Generation**: Fast speech synthesis with progress tracking

## 🎵 Available Voices

| Voice ID | Gender | Description |
|----------|--------|-------------|
| `expr-voice-2-m` | Male | Male voice variant 2 |
| `expr-voice-2-f` | Female | Female voice variant 2 |
| `expr-voice-3-m` | Male | Male voice variant 3 |
| `expr-voice-3-f` | Female | Female voice variant 3 |
| `expr-voice-4-m` | Male | Male voice variant 4 |
| `expr-voice-4-f` | Female | Female voice variant 4 |
| `expr-voice-5-m` | Male | Male voice variant 5 |
| `expr-voice-5-f` | Female | Female voice variant 5 |

## 🛠️ Usage

1. **Enter Text**: Type or paste your text in the input box
2. **Select Voice**: Choose from the dropdown menu of available voices
3. **Generate**: Click the "Generate Speech" button or press Enter
4. **Download**: Play the generated audio or download it

## 💻 Technical Details

- **Model**: [KittenML/kitten-tts-nano-0.1](https://huggingface.co/KittenML/kitten-tts-nano-0.1)
- **Sample Rate**: 24kHz
- **Framework**: KittenTTS
- **Interface**: Gradio
- **Audio Format**: WAV (24kHz, mono)

## 🔧 Local Development

To run this locally:

```bash
# Clone the repository
git clone
cd

# Install dependencies
pip install -r requirements.txt

# Run the application
python app.py
```

## 📦 Dependencies

- `gradio>=4.0.0` - Web interface
- `kittentts` - TTS framework
- `soundfile` - Audio file handling
- `numpy` - Numerical operations
- `torch` - PyTorch backend
- `torchaudio` - Audio processing
- `transformers` - Hugging Face transformers
- `accelerate` - Model acceleration

## 🤝 Contributing

Feel free to contribute by:

- Reporting bugs
- Suggesting new features
- Improving the UI
- Adding more voice options

## 📄 License

This project uses the KittenTTS model. Please refer to the original model's license for usage terms.

## 🙏 Acknowledgments

- [KittenML](https://huggingface.co/KittenML) for the TTS model
- [Hugging Face](https://huggingface.co) for the Spaces platform
- [Gradio](https://gradio.app) for the web interface framework

---

**Note**: This is a demonstration of the KittenTTS model. For production use, please ensure compliance with the model's license and terms of use.
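
## 🧪 Scripted Example

The Usage steps and Technical Details above can also be followed without the web interface. The sketch below is illustrative only: it validates a voice ID against the Available Voices table and writes samples as a 24kHz mono WAV using just the Python standard library. The helper names `describe_voice`, `write_wav`, and `synthesize` are hypothetical. The `KittenTTS(...)` constructor and its `generate(text, voice=...)` method are assumptions based on the upstream `kittentts` package's published usage and are not confirmed by this README, so treat `synthesize` as untested.

```python
import re
import sys
import wave
from array import array

SAMPLE_RATE = 24_000  # 24kHz, as listed under Technical Details

# Voice IDs from the Available Voices table: variants 2-5, male and female.
VOICES = [f"expr-voice-{n}-{g}" for n in range(2, 6) for g in ("m", "f")]

_VOICE_RE = re.compile(r"expr-voice-(\d+)-([mf])")


def describe_voice(voice_id):
    """Return (gender, variant) for a known voice ID, else raise ValueError."""
    match = _VOICE_RE.fullmatch(voice_id)
    if match is None or voice_id not in VOICES:
        raise ValueError(f"unknown voice: {voice_id!r}")
    gender = "Male" if match.group(2) == "m" else "Female"
    return gender, int(match.group(1))


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write float samples in [-1.0, 1.0] as a 16-bit mono PCM WAV file."""
    pcm = array("h")
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # clip out-of-range values
        pcm.append(int(clipped * 32767))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)  # mono
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.tobytes())


def synthesize(text, voice="expr-voice-2-f"):
    """Generate speech samples with KittenTTS (assumed API, see the note above)."""
    describe_voice(voice)  # fail early on an unknown voice ID
    from kittentts import KittenTTS  # lazy import: only needed for real synthesis

    model = KittenTTS("KittenML/kitten-tts-nano-0.1")
    return model.generate(text, voice=voice)
```

If the assumed API holds and `generate` returns a one-dimensional array of float samples, `write_wav("output.wav", synthesize("Hello from KittenTTS"))` would produce a file in the format listed under Technical Details. The `soundfile` dependency could replace `write_wav` where it is already installed.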