bigwolfeman Claude committed on
Commit
b5f2489
·
1 Parent(s): da05455

Add Text-to-Speech (TTS) documentation to demo vault


- Added comprehensive TTS setup guide in seed.py
- Documents ElevenLabs API key and voice ID configuration
- Includes voice recommendations (Rachel, Adam, Antoni, etc.)
- Covers HuggingFace Spaces setup instructions
- Added troubleshooting section for common TTS errors
- Referenced from Getting Started note

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Files changed (1)
  1. backend/src/services/seed.py +106 -0
backend/src/services/seed.py CHANGED
@@ -38,6 +38,7 @@ Use this vault as a shared memory substrate between your local coding agents and
38
  - **Wikilinks**: Link between notes using `[[Note Name]]` syntax
39
  - **Full-Text Search**: Powered by SQLite FTS5 with BM25 ranking
40
  - **Interactive Graph**: Visualize your vault's connections (Toggle via top-right menu)
 
41
  - **MCP Integration**: AI agents can read and write docs via [[MCP Integration]]
42
  - **Multi-Tenant**: Each user has an isolated vault
43
 
@@ -1108,6 +1109,111 @@ When upgrading from 0.13.x to 0.14.x:
1108
  - [[Architecture Overview]] - System design
1109
  - [[Search Features]] - SQLite FTS5 indexing
1110
  - Official LlamaIndex docs: https://docs.llamaindex.ai/"""
1111
  },
1112
  {
1113
  "path": "The Commit Keeper.md",
 
38
  - **Wikilinks**: Link between notes using `[[Note Name]]` syntax
39
  - **Full-Text Search**: Powered by SQLite FTS5 with BM25 ranking
40
  - **Interactive Graph**: Visualize your vault's connections (Toggle via top-right menu)
41
+ - **Text-to-Speech**: Listen to your notes with ElevenLabs AI voices - see [[Text-to-Speech (TTS)]]
42
  - **MCP Integration**: AI agents can read and write docs via [[MCP Integration]]
43
  - **Multi-Tenant**: Each user has an isolated vault
44
 
 
1109
  - [[Architecture Overview]] - System design
1110
  - [[Search Features]] - SQLite FTS5 indexing
1111
  - Official LlamaIndex docs: https://docs.llamaindex.ai/"""
1112
+ },
1113
+ {
1114
+ "path": "Text-to-Speech (TTS).md",
1115
+ "title": "Text-to-Speech (TTS)",
1116
+ "tags": ["tts", "elevenlabs", "audio", "accessibility", "features"],
1117
+ "body": """# Text-to-Speech (TTS)
1118
+
1119
+ The Document Viewer includes an integrated **Text-to-Speech** feature powered by the **ElevenLabs API**, so you can listen to your documentation instead of reading it.
1120
+
1121
+ ## Features
1122
+
1123
+ - **High-Quality Voices**: Professional AI voices from ElevenLabs
1124
+ - **Note Reading**: Play an audio version of any note with a single click
1125
+ - **Playback Controls**: Play, pause, and stop functionality
1126
+ - **Seamless Integration**: A TTS button appears in the note viewer toolbar
1127
+
1128
+ ## Setup (Required for Self-Hosting)
1129
+
1130
+ The TTS feature requires two environment variables to be configured:
1131
+
1132
+ ### 1. Get an ElevenLabs API Key
1133
+
1134
+ 1. Sign up at https://elevenlabs.io
1135
+ 2. Navigate to your profile settings
1136
+ 3. Copy your API key
1137
+ 4. Set the environment variable: `ELEVENLABS_API_KEY=your-api-key-here`
1138
+
1139
+ ### 2. Choose a Voice ID
1140
+
1141
+ ElevenLabs provides several high-quality pre-made voices. Choose one and set the `ELEVENLABS_VOICE_ID` environment variable.
1142
+
1143
+ **Recommended Voice IDs:**
1144
+
1145
+ | Voice Name | ID | Description |
1146
+ |------------|----|----|
1147
+ | **Rachel** (recommended) | `21m00Tcm4TlvDq8ikWAM` | American female, calm and clear |
1148
+ | **Adam** | `pNInz6obpgDQGcFmaJgB` | American male, deep and authoritative |
1149
+ | **Antoni** | `ErXwobaYiN019PkySvjV` | American male, well-rounded |
1150
+ | **Josh** | `TxGEqnHWrfWFTfGW9XjX` | American male, young and energetic |
1151
+ | **Bella** | `EXAVITQu4vr4xnSDxMaL` | American female, soft and pleasant |
1152
+ | **Elli** | `MF3mGyEYCl7XYWbV9V6O` | American female, emotional and expressive |
1153
+
1154
+ **Example `.env` configuration:**
1155
+ ```bash
1156
+ ELEVENLABS_API_KEY=sk_abc123...
1157
+ ELEVENLABS_VOICE_ID=21m00Tcm4TlvDq8ikWAM
1158
+ ```
1159
+
1160
+ ### 3. For HuggingFace Spaces
1161
+
1162
+ If deploying to HuggingFace Spaces:
1163
+
1164
+ 1. Go to your Space **Settings** → **Variables and secrets**
1165
+ 2. Add two **secrets**:
1166
+ - **Name:** `ELEVENLABS_API_KEY`, **Value:** your API key
1167
+ - **Name:** `ELEVENLABS_VOICE_ID`, **Value:** `21m00Tcm4TlvDq8ikWAM` (or your chosen voice)
1168
+ 3. Restart your Space for changes to take effect
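The two variables from the setup steps above could be read and validated at startup roughly as follows. This is a minimal sketch and not the code in `backend/src/api/routes/tts.py`: the `TTSConfig` dataclass and the `load_tts_config` function are hypothetical names introduced only for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TTSConfig:
    """Hypothetical container for the two settings the guide requires."""
    api_key: str
    voice_id: str


def load_tts_config(env: Optional[Mapping[str, str]] = None) -> TTSConfig:
    """Read the ElevenLabs settings and fail early with a clear message."""
    env = os.environ if env is None else env
    api_key = env.get("ELEVENLABS_API_KEY", "").strip()
    voice_id = env.get("ELEVENLABS_VOICE_ID", "").strip()
    if not api_key:
        raise RuntimeError("ELEVENLABS_API_KEY is not set")
    if not voice_id:
        # Assumed wording, modeled on the "Voice ID is required" error
        # listed in the Troubleshooting section of the note.
        raise RuntimeError("Voice ID is required: set ELEVENLABS_VOICE_ID")
    return TTSConfig(api_key=api_key, voice_id=voice_id)
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests without touching the real environment.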
1169
+
1170
+ ## How to Use
1171
+
1172
+ 1. **Open any note** in the Document Viewer
1173
+ 2. **Click the speaker icon** in the toolbar (next to the Edit button)
1174
+ 3. **Wait for synthesis** - the text is converted to speech (may take a few seconds)
1175
+ 4. **Audio plays automatically** once ready
1176
+ 5. **Use controls** to pause, resume, or stop playback
1177
+
1178
+ The TTS feature converts markdown to plain text before synthesis, removing formatting characters for natural-sounding speech.
1179
+
1180
+ ## Technical Details
1181
+
1182
+ - **API Endpoint**: `POST /api/tts/synthesize`
1183
+ - **Implementation**: `backend/src/api/routes/tts.py`
1184
+ - **Frontend**: TTS controls in `NoteViewer.tsx`
1185
+ - **Model**: ElevenLabs multilingual v2 model
1186
+ - **Format**: MP3 audio stream
1187
+
1188
+ The backend strips markdown formatting and converts the content to plain text before sending to ElevenLabs, ensuring the audio sounds natural without reading asterisks, brackets, or other markdown syntax.
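The markdown-to-plain-text step described above might look something like the following. This is an illustrative sketch using only the standard-library `re` module; the function name `markdown_to_plain_text` is hypothetical, the actual backend may use a different approach or a dedicated library, and the regular expressions cover only common syntax.

```python
import re


def markdown_to_plain_text(markdown: str) -> str:
    """Remove common markdown syntax so it is not read aloud."""
    # Drop fenced code blocks entirely; they rarely read well as speech.
    text = re.sub(r"```.*?```", " ", markdown, flags=re.DOTALL)
    # Keep the content of inline code, images (alt text), and links (label).
    text = re.sub(r"`([^`]*)`", r"\1", text)
    text = re.sub(r"!\[([^\]]*)\]\([^)]*\)", r"\1", text)
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)
    # Wikilinks: [[Note Name]] or [[Note Name|alias]] -> alias or note name.
    text = re.sub(
        r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]",
        lambda m: m.group(2) or m.group(1),
        text,
    )
    # Strip heading markers and bullet markers at the start of a line.
    text = re.sub(r"^[ ]{0,3}#{1,6}[ \t]+", "", text, flags=re.MULTILINE)
    text = re.sub(r"^[ \t]*[-*+][ \t]+", "", text, flags=re.MULTILINE)
    # Unwrap bold and italic written with asterisks.
    text = re.sub(r"(\*\*|\*)(.+?)\1", r"\2", text)
    # Collapse runs of spaces and tabs left behind by the substitutions.
    return re.sub(r"[ \t]+", " ", text).strip()
```

A regex pass like this is easy to read but deliberately incomplete: tables, HTML, and underscore emphasis are left untouched.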
1189
+
1190
+ ## Rate Limits
1191
+
1192
+ The ElevenLabs free tier has a monthly character quota:
1193
+ - 10,000 characters per month
1194
+ - Enough for occasional documentation reading
1195
+ - Upgrade to a paid tier for higher limits
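To get a feel for the quota, the arithmetic can be sketched as below. The 10,000 figure is the one quoted in the note; the helper names are hypothetical, and the assumption that usage is counted as the length of the plain text sent for synthesis should be checked against your ElevenLabs plan.

```python
FREE_TIER_CHARS_PER_MONTH = 10_000  # figure quoted in the note above


def fits_in_quota(text: str, used_chars: int,
                  quota: int = FREE_TIER_CHARS_PER_MONTH) -> bool:
    """Return True if synthesizing `text` would stay within the quota."""
    return used_chars + len(text) <= quota


def notes_per_month(avg_note_chars: int,
                    quota: int = FREE_TIER_CHARS_PER_MONTH) -> int:
    """Estimate how many notes of a given plain-text length fit in a month."""
    if avg_note_chars <= 0:
        raise ValueError("avg_note_chars must be positive")
    return quota // avg_note_chars
```

Under these assumptions, notes averaging 2,500 characters of plain text would allow about four full readings per month.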
1196
+
1197
+ ## Troubleshooting
1198
+
1199
+ **"Voice ID is required" error:**
1200
+ - Ensure `ELEVENLABS_VOICE_ID` is set in environment variables
1201
+ - Restart the server/Space after adding the variable
1202
+
1203
+ **Audio doesn't play:**
1204
+ - Check browser console for errors
1205
+ - Verify `ELEVENLABS_API_KEY` is valid
1206
+ - Ensure your browser supports HTML5 audio
1207
+
1208
+ **Poor audio quality:**
1209
+ - Try a different voice ID from the list above
1210
+ - Check if the note content has unusual characters that might affect synthesis
1211
+
1212
+ ## See Also
1213
+
1214
+ - [[Architecture Overview]] - System design
1215
+ - [[Self Hosting]] - Deployment guide
1216
+ - [[Getting Started]] - First steps"""
1217
  },
1218
  {
1219
  "path": "The Commit Keeper.md",