Audio Spaces
-
71
-
951
Seamless M4T
-
5.07k
MusicGen
Generate music from text descriptions and optional melodies
-
812
Audioldm Text To Audio Generation
Generate audio from text descriptions
-
308
AudioLDM2 Text2Audio Text2Music Generation
Generate audio and waveform video from text
-
222
AudioSep
-
170
Lp Music Caps
Generate captions for music audio
-
311
Tortoise Tts
Expressive Text-to-Speech
-
22
All In One
-
2.77k
XTTS
Generate speech from text using a reference voice
-
189
Coqui Bark Voice Cloning
-
367
VALL E X
Generate audio from text using voice prompts
-
193
WavJourney
-
264
Music To Image
-
277
MMS
Transform and identify speech with MMS
-
608
ElevenLabs TTS
Generate voice from text using ElevenLabs
-
289
AudioGPT
-
2.38k
Bark
Generate realistic audio from text
-
36
SpeechT5 Speech Recognition Demo
-
174
CoquiTTS (Official)
-
2.58k
Whisper
Transcribe audio files or YouTube videos into text
-
658
Moe TTS
Generate and convert voice with text and audio inputs
-
17
YourTTS
-
557
Talking Face Generation with Multilingual TTS
Generate a talking face video from text in multiple languages
-
562
OpenAI TTS New
-
167
Mustango
-
55
OWSM Demo
-
698
StyleTTS 2
Efficient, fast, and natural text to speech with StyleTTS 2!
-
400
HierSpeech++ (Zero-shot TTS)
Generate high-quality speech from text using prompt audio
-
21
Video2music
Generate music for a video based on its content and key
-
187
Whisper Large V2
-
64
Musicgen Prompt Upsampling
Generate music from text prompts
-
516
Seamless M4T v2
Translate speech and text between languages
-
318
Seamless Streaming
Translate text between languages
-
52
Matcha TTS
Generate speech from text with speaker selection
-
276
MusicGen Streaming
Generate music from text prompts
-
415
Resemble Enhance
Enhance and denoise your audio files
-
260
Singing Voice Conversion
Transform your voice into a singer's
-
52
NaturalSpeech2
Generate speech with cloned timbre
-
21
Create Your Own TTS Dataset
-
Podcast Transcription
-
1.1k
OpenVoice
Generate voice from text using reference audio
-
94
M2UGen Demo
-
68
Pheme
-
7
ESPnet2 TTS
Convert text to speech in English, Chinese, or Japanese
-
37
Whisper-WebUI
Generate subtitles and translate audio files
-
173
Image2SFX Comparison
Generate an audio environment from an image
-
379
WhisperSpeech
-
144
MetaVoice 1B
A demo of MetaVoice 1B, a new TTS model by MetaVoice.
-
893
TTS Arena V2
Vote on the latest TTS models!
-
173
Whisper Speech X DreamTalk
Combine voice cloning and portrait lipsync animation
-
197
Canary 1b
Transcribe and translate audio into text
-
81
SALMONN Audio Questioning
Deeply interrogate audio file content
-
467
MeloTTS
Fast, efficient, & multilingual text-to-speech
-
311
Audio Editing
Edit audio with text prompts
-
18
ChatMusician
-
73
xVASynth TTS
CPU powered, low RTF, emotional, multilingual TTS
-
180
NaturalSpeech3 FACodec
Convert and reconstruct speech files
-
25
Hey Gemma
-
70
Ratchet + Whisper
Convert audio to text
-
3
AutoSubs
Automatically add on-screen subs to your videos
-
161
VoiceCraft
-
321
TangoFlux
Text to Audio (Sound SFX) Generator
-
826
Parler-TTS
High-fidelity Text-To-Speech
-
184
Sing an idea ➡️ Music
Bring song ideas to life
-
75
Musicgen Songstarter Demo
Generate music using descriptions and optional melody audio
-
145
Whisper JAX
Transcribe or translate audio from microphone, file, or YouTube
-
22
AudioLCM
Generate audio from text
-
160
Stable Audio Live Multiplayer
Generate audio from text prompts
-
447
Stable Audio Open Zero
Generate audio from text prompts
-
13
Make An Audio 3
Generate audio from text prompts
-
60
Mars5 Space
-
5
Tango Music AF
Text to Music Generator
-
16
Jam
Generate a song from lyrics and style reference
-
107
BigVGAN
Generate high-quality audio from input audio
-
89
SenseVoice
Transcribe audio with emotions and events
-
29
PicoAudio
Generate audio from text descriptions with timestamps
-
7
Audio Flamingo Demo
-
29
MusiConGen
-
20
Mms Zeroshot
Transcribe audio in any language using text data
-
200
GPT SoVITS V2 Pro Plus
Generate speech from text using reference audio
-
274
EzAudio
Generate and edit audio from text prompts
-
214
OpenMusic
Generate music from text descriptions
-
545
Midi Music Generator
Generate MIDI music from prompts
-
987
Whisper Turbo
Transcribe audio or YouTube videos into text
-
338
Realtime Whisper Turbo
Realtime implementation of Whisper large turbo
-
163
Whisper Large V3 Turbo WebGPU
ML-powered speech recognition directly in your browser
-
653
OpenAudio S1
Generate speech from text
-
445
TTS Spaces Arena
Blind vote on HF TTS models!
-
19
Diva Realtime Chat
Generate text responses from audio input
-
2.65k
F5-TTS
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
-
260
MaskGCT TTS Demo
MaskGCT TTS Demo
-
129
MelodyFlow
Generate music from text descriptions
-
146
Fish Agent
An end-to-end (e2e) Voice Language Model by Fish Audio.
-
64
Nexa Omni Demo
Generate text from audio input
-
2.99k
Kokoro TTS
Upgraded to v1.0!
-
117
Make Custom Voices With KokoroTTS
Make Custom Voices With KokoroTTS
-
310
Llasa 3b Tts
Zero-shot voice cloning with Llasa 3B (Unofficial Demo)
-
12
Llasa 1b Multilingual TTS
Generate speech from text with or without cloning a voice
-
344
Kokoro Text-to-Speech (WebGPU)
High-quality speech synthesis powered by Kokoro TTS
-
42
Hibiki Simple
High-Fidelity Simultaneous Speech-To-Speech Translation
-
407
Zonos
Generate audio from text with customizable emotions and settings
-
75
Kokoro Web
ML-powered speech synthesis directly in your browser
-
644
Di♪♪Rhythm
Blazingly Fast and Embarrassingly Simple Song Generation
-
22
Audiobox Aesthetics
Demo for audiobox-aesthetics
-
229
Spark TTS
A text-to-speech model powered by SparkAudio and Mobvoi.
-
844
Sesame CSM
Conversational speech generation
-
238
Orpheus TTS
Try Orpheus TTS here
-
42
Canary 1B Flash
Canary 1B Flash demo
-
216
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Generate speech from text using reference audio
-
6
AudioMorphix
Prepare environment and run Gradio app
-
93
MegaTTS3 Demo
-
155
AudioX
Generate audio from text and video prompts
-
100
Vevo for Zero-shot VC, TTS, and More
Controllable Zero-Shot Voice Imitation
-
1.7k
Dia 1.6B
Generate realistic dialogue from a script, using Dia!
-
43
Aero 1 Audio Demo
Demo for Aero-1-Audio
-
44
Voila Demo
Chat with a voice-clone AI
-
589
ACE Step
A Step Towards Music Generation Foundation Model
-
2
Audio Difficulty Estimator
Estimate piano difficulty from audio
-
105
TIGER Audio Extractor
Extraction & Reconstruction for Efficient Speech Separation
-
14
Music2emo
Towards Unified Music Emotion Recognition across Dimensional
-
13
SonicVerse
Generate detailed music descriptions from audio clips
-
39
Auffusion
Audio Gen, Audio Style Transfer and Audio InPainting
-
1.58k
Chatterbox TTS
Expressive Zero-Shot TTS
-
117
PlayDiffusion
Generate modified audio from text and voice
-
2
Voice Clone Arena
Vote on the latest Voice Clone TTS models!
-
219
Conversational WebGPU
-
462
Song Generation
Generate a custom song from lyrics and optional prompts
-
54
NotaGen
Generate classical sheet music in ABC notation
-
81
Audio Flamingo 3 Demo
Audio Flamingo 3 Demo
-
33
Audio Flamingo 3 Chat
Audio Flamingo 3 demo for multi-turn multi-audio chat
-
6
MSR UTMOS
Multiple sampling rate MOS prediction with SFI conv
-
384
Higgs Audio Demo
Higgs Audio Demo
-
15
sidon_demo_beta
Speech restoration demo of Sidon.
-
65
Canary 1b V2
Transcribe and Translate in 25 European Languages
-
17
SonicMaster – Text-Guided Music Restoration & Mastering
Enhance audio using text prompts
-
6
OLMoASR
Open Models and Data for Training Robust Speech Recognition
-
85
VibeVoice-Large
Generate podcast audio from a script and voice samples
-
10
TaDiCodec TTS AR Qwen2.5 0.5B
Generate speech from text with voice cloning
-
8
EchoX
An end-to-end speech large language model.
-
43
VoxCPM 0.5B
Generate expressive speech from text with optional voice cloning
-
34
FireRedTTS2
Long-form multi-speaker dialogue generation
-
3
FireRedASR
FireRedASR Demo
-
473
IndexTTS 2 Demo
Generate expressive speech from text with emotion control
-
8
SongFormer
State-of-the-art music analysis with multi-scale datasets
-
10
EmoAct MiMo
Controllable emotional TTS