bigwolfeman, Claude committed
Commit b5f2489 · 1 Parent(s): da05455
Add Text-to-Speech (TTS) documentation to demo vault
- Added comprehensive TTS setup guide in seed.py
- Documents ElevenLabs API key and voice ID configuration
- Includes voice recommendations (Rachel, Adam, Antoni, etc.)
- Covers HuggingFace Spaces setup instructions
- Added troubleshooting section for common TTS errors
- Referenced from Getting Started note
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
- backend/src/services/seed.py +106 -0
backend/src/services/seed.py
CHANGED
@@ -38,6 +38,7 @@ Use this vault as a shared memory substrate between your local coding agents and
 - **Wikilinks**: Link between notes using `[[Note Name]]` syntax
 - **Full-Text Search**: Powered by SQLite FTS5 with BM25 ranking
 - **Interactive Graph**: Visualize your vault's connections (Toggle via top-right menu)
+- **Text-to-Speech**: Listen to your notes with ElevenLabs AI voices - see [[Text-to-Speech (TTS)]]
 - **MCP Integration**: AI agents can read and write docs via [[MCP Integration]]
 - **Multi-Tenant**: Each user has an isolated vault
 
@@ -1108,6 +1109,111 @@ When upgrading from 0.13.x to 0.14.x:
 - [[Architecture Overview]] - System design
 - [[Search Features]] - SQLite FTS5 indexing
 - Official LlamaIndex docs: https://docs.llamaindex.ai/"""
+},
+{
+"path": "Text-to-Speech (TTS).md",
+"title": "Text-to-Speech (TTS)",
+"tags": ["tts", "elevenlabs", "audio", "accessibility", "features"],
+"body": """# Text-to-Speech (TTS)
+
+The Document Viewer includes an integrated **Text-to-Speech** feature powered by **ElevenLabs API**, allowing you to listen to your documentation instead of reading it.
+
+## Features
+
+- **High-Quality Voices**: Professional AI voices from ElevenLabs
+- **Note Reading**: Play audio version of any note with a single click
+- **Playback Controls**: Play, pause, and stop functionality
+- **Seamless Integration**: TTS button appears in note viewer toolbar
+
+## Setup (Required for Self-Hosting)
+
+The TTS feature requires two environment variables to be configured:
+
+### 1. Get an ElevenLabs API Key
+
+1. Sign up at https://elevenlabs.io
+2. Navigate to your profile settings
+3. Copy your API key
+4. Set the environment variable: `ELEVENLABS_API_KEY=your-api-key-here`
+
+### 2. Choose a Voice ID
+
+ElevenLabs provides several high-quality pre-made voices. Choose one and set the `ELEVENLABS_VOICE_ID` environment variable.
+
+**Recommended Voice IDs:**
+
+| Voice Name | ID | Description |
+|------------|----|----|
+| **Rachel** (recommended) | `21m00Tcm4TlvDq8ikWAM` | American female, calm and clear |
+| **Adam** | `pNInz6obpgDQGcFmaJgB` | American male, deep and authoritative |
+| **Antoni** | `ErXwobaYiN019PkySvjV` | American male, well-rounded |
+| **Josh** | `TxGEqnHWrfWFTfGW9XjX` | American male, young and energetic |
+| **Bella** | `EXAVITQu4vr4xnSDxMaL` | American female, soft and pleasant |
+| **Elli** | `MF3mGyEYCl7XYWbV9V6O` | American female, emotional and expressive |
+
+**Example `.env` configuration:**
+```bash
+ELEVENLABS_API_KEY=sk_abc123...
+ELEVENLABS_VOICE_ID=21m00Tcm4TlvDq8ikWAM
+```
+
+### 3. For HuggingFace Spaces
+
+If deploying to HuggingFace Spaces:
+
+1. Go to your Space **Settings** → **Variables and secrets**
+2. Add two **secrets**:
+   - **Name:** `ELEVENLABS_API_KEY`, **Value:** your API key
+   - **Name:** `ELEVENLABS_VOICE_ID`, **Value:** `21m00Tcm4TlvDq8ikWAM` (or your chosen voice)
+3. Restart your Space for changes to take effect
+
+## How to Use
+
+1. **Open any note** in the Document Viewer
+2. **Click the speaker icon** in the toolbar (appears next to Edit button)
+3. **Wait for synthesis** - the text is converted to speech (may take a few seconds)
+4. **Audio plays automatically** once ready
+5. **Use controls** to pause, resume, or stop playback
+
+The TTS feature converts markdown to plain text before synthesis, removing formatting characters for natural-sounding speech.
+
+## Technical Details
+
+- **API Endpoint**: `POST /api/tts/synthesize`
+- **Implementation**: `backend/src/api/routes/tts.py`
+- **Frontend**: TTS controls in `NoteViewer.tsx`
+- **Model**: ElevenLabs multilingual v2 model
+- **Format**: MP3 audio stream
+
+The backend strips markdown formatting and converts the content to plain text before sending to ElevenLabs, ensuring the audio sounds natural without reading asterisks, brackets, or other markdown syntax.
+
+## Rate Limits
+
+ElevenLabs free tier provides:
+- 10,000 characters per month
+- Sufficient for occasional documentation reading
+- Upgrade to paid tier for higher limits
+
+## Troubleshooting
+
+**"Voice ID is required" error:**
+- Ensure `ELEVENLABS_VOICE_ID` is set in environment variables
+- Restart the server/Space after adding the variable
+
+**Audio doesn't play:**
+- Check browser console for errors
+- Verify `ELEVENLABS_API_KEY` is valid
+- Ensure your browser supports HTML5 audio
+
+**Poor audio quality:**
+- Try a different voice ID from the list above
+- Check if the note content has unusual characters that might affect synthesis
+
+## See Also
+
+- [[Architecture Overview]] - System design
+- [[Self Hosting]] - Deployment guide
+- [[Getting Started]] - First steps"""
 },
 {
 "path": "The Commit Keeper.md",
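
The note added in this commit describes two backend steps that the diff itself does not show, since `backend/src/api/routes/tts.py` is not part of this change: reading `ELEVENLABS_API_KEY` and `ELEVENLABS_VOICE_ID`, and stripping markdown before synthesis. The Python sketch below illustrates both under stated assumptions. The helper names `load_tts_config` and `markdown_to_plain_text` are hypothetical, the regex rules are a simplified approximation of markdown stripping, and only the "Voice ID is required" message comes from the Troubleshooting section. The real route may work differently.

```python
import re
from typing import Mapping


def load_tts_config(env: Mapping[str, str]) -> dict[str, str]:
    """Read the two documented variables and fail early if either is missing.

    Hypothetical helper: the actual backend may validate differently.
    """
    api_key = env.get("ELEVENLABS_API_KEY", "").strip()
    voice_id = env.get("ELEVENLABS_VOICE_ID", "").strip()
    if not api_key:
        raise ValueError("ELEVENLABS_API_KEY is required")
    if not voice_id:
        # Message taken from the note's Troubleshooting section.
        raise ValueError("Voice ID is required")
    return {"api_key": api_key, "voice_id": voice_id}


def markdown_to_plain_text(markdown: str) -> str:
    """Approximate markdown stripping so symbols are not read aloud.

    Hypothetical helper: a simplified, regex-based stand-in.
    """
    # Drop fenced code blocks entirely.
    text = re.sub(r"```.*?```", " ", markdown, flags=re.DOTALL)
    # Keep the content of inline code spans.
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # [[Target]] or [[Target|Alias]] wikilinks become their visible text.
    text = re.sub(
        r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]",
        lambda m: m.group(2) or m.group(1),
        text,
    )
    # Images become their alt text, links become their label.
    text = re.sub(r"!\[([^\]]*)\]\([^)]*\)", r"\1", text)
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)
    # Remove heading markers and list markers at the start of a line.
    text = re.sub(r"^[ ]{0,3}#{1,6}[ \t]*", "", text, flags=re.MULTILINE)
    text = re.sub(r"^[ \t]*(?:[-*+]|\d+\.)[ \t]+", "", text, flags=re.MULTILINE)
    # Remove asterisk emphasis and strikethrough markers; underscores are
    # left alone so identifiers such as ELEVENLABS_API_KEY stay intact.
    text = re.sub(r"\*+|~~", "", text)
    # Normalize whitespace.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

In a deployment, `load_tts_config(os.environ)` would run before any request is sent to ElevenLabs, and the plain text returned by `markdown_to_plain_text` would be what is submitted for synthesis. A production implementation would more likely use a real markdown parser than regular expressions.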