bigwolfe committed on
Commit
687dcda
·
1 Parent(s): 398d547

Fix HF Spaces README configuration

Files changed (2)
  1. .gitignore +1 -0
  2. README.md +77 -314
.gitignore CHANGED
@@ -52,3 +52,4 @@ data/
52
 
53
  reproduce_auth_500.py
54
  debug_list_notes.py
55
+ DEPLOY_TO_HF.md
README.md CHANGED
@@ -1,358 +1,121 @@
1
- # Document Viewer
2
-
3
- A multi-tenant Obsidian-like documentation system with AI agent integration via Model Context Protocol (MCP).
4
-
5
- ## 🎯 Overview
6
-
7
- Document Viewer enables both humans and AI agents to create, browse, and search documentation with powerful features like:
8
-
9
- 📝 **Markdown Notes** with YAML frontmatter
10
- 🔗 **Wikilinks** - `[[Note Name]]` style internal linking with auto-resolution
11
- 🔍 **Full-Text Search** - BM25 ranking with recency bonus
12
- ↩️ **Backlinks** - Automatic tracking of which notes reference each other
13
- 🏷️ **Tags** - Organize notes with frontmatter tags
14
- ✏️ **Split-Pane Editor** - Live markdown preview with optimistic concurrency
15
- 🤖 **MCP Integration** - AI agents can read/write docs via FastMCP
16
- 👥 **Multi-Tenant** - Isolated vaults per user (production ready with HF OAuth)
17
-
18
- ## 🏗️ Tech Stack
19
-
20
- ### Backend
21
- - **FastAPI** - HTTP API server
22
- - **FastMCP** - MCP server for AI agent integration
23
- - **SQLite FTS5** - Full-text search with BM25 ranking
24
- - **python-frontmatter** - YAML frontmatter parsing
25
- - **PyJWT** - Token-based authentication
26
-
27
- ### Frontend
28
- - **React + Vite** - Modern web framework
29
- - **shadcn/ui** - Beautiful UI components
30
- - **Tailwind CSS** - Utility-first styling
31
- - **react-markdown** - Markdown rendering with custom wikilink support
32
- - **TypeScript** - Type-safe frontend code
33
-
34
- ## 📦 Local Setup
35
-
36
- ### Prerequisites
37
- - Python 3.11+
38
- - Node.js 18+
39
- - `uv` (Python package manager) or `pip`
40
-
41
- ### 1. Clone Repository
42
-
43
- ```bash
44
- git clone <repository-url>
45
- cd Document-MCP
46
- ```
47
-
48
- ### 2. Backend Setup
49
-
50
- ```bash
51
- cd backend
52
-
53
- # Create virtual environment
54
- uv venv
55
- # or: python -m venv .venv
56
-
57
- # Install dependencies
58
- uv pip install -e .
59
- # or: .venv/bin/pip install -e .
60
-
61
- # Initialize database
62
- cd ..
63
- VIRTUAL_ENV=backend/.venv backend/.venv/bin/python -c "from backend.src.services.database import init_database; init_database()"
64
- ```
65
-
66
- ### 3. Frontend Setup
67
-
68
- ```bash
69
- cd frontend
70
-
71
- # Install dependencies
72
- npm install
73
- ```
74
-
75
- ### 4. Environment Configuration
76
-
77
- The project includes development scripts that set environment variables automatically. For manual configuration, create a `.env` file in the backend directory:
78
-
79
- ```bash
80
- # backend/.env
81
- JWT_SECRET_KEY=your-secret-key-here
82
- VAULT_BASE_PATH=/path/to/Document-MCP/data/vaults
83
- ```
84
-
85
- See `.env.example` for all available options.
86
-
87
- ## 🚀 Running the Application
88
-
89
- ### Easy Start (Recommended)
90
-
91
- Use the provided scripts to start both servers:
92
-
93
- ```bash
94
- # Start frontend and backend
95
- ./start-dev.sh
96
-
97
- # Check status
98
- ./status-dev.sh
99
-
100
- # Stop servers
101
- ./stop-dev.sh
102
-
103
- # View logs
104
- tail -f backend.log frontend.log
105
- ```
106
-
107
- ### Manual Start
108
-
109
- #### Running Backend
110
-
111
- Start the HTTP API server:
112
-
113
- ```bash
114
- cd backend
115
- JWT_SECRET_KEY="local-dev-secret-key-123" \
116
- VAULT_BASE_PATH="$(pwd)/../data/vaults" \
117
- .venv/bin/uvicorn main:app --host 0.0.0.0 --port 8000 --reload
118
- ```
119
-
120
- Backend will be available at: `http://localhost:8000`
121
 
122
- API docs (Swagger): `http://localhost:8000/docs`
123
 
124
- #### Running MCP Server (STDIO Mode)
125
 
126
- For AI agent integration via MCP:
127
 
128
- ```bash
129
- cd backend
130
- JWT_SECRET_KEY="local-dev-secret-key-123" \
131
- VAULT_BASE_PATH="$(pwd)/../data/vaults" \
132
- .venv/bin/python -m src.mcp.server
133
- ```
134
 
135
- #### Running Frontend
 
 
136
 
137
- ```bash
138
- cd frontend
139
- npm run dev
140
- ```
141
 
142
- Frontend will be available at: `http://localhost:5173`
143
 
144
- ## 🤖 MCP Client Configuration
145
 
146
- To use the Document Viewer with AI agents (Claude Desktop, Cline, etc.), add this to your MCP configuration:
 
 
 
147
 
148
- ### Claude Desktop / Cline
149
 
150
- Add to `~/.cursor/mcp.json` (or Claude Desktop settings):
151
 
152
  ```json
153
  {
154
  "mcpServers": {
155
  "obsidian-docs": {
156
- "command": "python",
157
- "args": ["-m", "backend.src.mcp.server"],
158
- "cwd": "/path/to/Document-MCP",
159
- "env": {
160
- "BEARER_TOKEN": "local-dev-token",
161
- "FASTMCP_SHOW_CLI_BANNER": "false",
162
- "PYTHONPATH": "/path/to/Document-MCP",
163
- "JWT_SECRET_KEY": "local-dev-secret-key-123",
164
- "VAULT_BASE_PATH": "/path/to/Document-MCP/data/vaults"
165
  }
166
  }
167
  }
168
  }
169
  ```
170
 
171
- **Note:** In production, use actual JWT tokens instead of `local-dev-token`.
172
-
173
- ### Available MCP Tools
174
-
175
- AI agents can use these tools:
176
 
177
- - `list_notes` - List all notes in vault
178
- - `read_note` - Read a specific note
179
- - `write_note` - Create or update a note
180
- - `delete_note` - Remove a note
181
- - `search_notes` - Full-text search with BM25 ranking
182
- - `get_backlinks` - Find notes linking to a target
183
- - `get_tags` - List all tags with usage counts
184
 
185
- ## 🏛️ Architecture
186
-
187
- ### Data Model
188
-
189
- **Note Structure:**
190
- ```yaml
191
- ---
192
- title: My Note
193
- tags: [guide, tutorial]
194
- created: 2025-01-15T10:00:00Z
195
- updated: 2025-01-15T14:30:00Z
196
- ---
197
-
198
- # My Note
199
-
200
- Content with [[Wikilinks]] to other notes.
201
- ```
202
-
203
- **Vault Structure:**
204
- ```
205
- data/vaults/
206
- ├── local-dev/ # Development user vault
207
- │   ├── Getting Started.md
208
- │   ├── API Documentation.md
209
- │   └── ...
210
- └── {user_id}/ # Production user vaults
211
-     └── *.md
212
- ```
213
 
214
- **Index Tables (SQLite):**
215
- - `note_metadata` - Note versions, titles, timestamps
216
- - `note_fts` - FTS5 full-text search index
217
- - `note_tags` - Tag associations
218
- - `note_links` - Wikilink graph (resolved/unresolved)
219
- - `index_health` - Index statistics per user
220
-
221
- ### Key Features
222
-
223
- **Wikilink Resolution:**
224
- - Normalizes titles to slugs: `[[Getting Started]]` → `getting-started`
225
- - Matches against both title and filename
226
- - Prefers same-folder matches
227
- - Tracks broken links for UI styling
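The resolution rules listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `slugify_title`, `resolve_wikilink`, and `slug_index` are hypothetical, and the accent-stripping step is an assumption that the removed README does not describe.

```python
import re
import unicodedata
from pathlib import PurePosixPath


def slugify_title(title: str) -> str:
    """Normalize a title or wikilink target to a lowercase, hyphenated slug."""
    # Assumption: accents are stripped so "Café" and "Cafe" share one slug.
    ascii_text = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse each run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def resolve_wikilink(target: str, source_path: str, slug_index: dict) -> "str | None":
    """Return the best matching note path, or None for a broken link.

    slug_index (hypothetical) maps a slug to every note path whose title or
    filename normalizes to that slug.
    """
    candidates = slug_index.get(slugify_title(target), [])
    if not candidates:
        return None  # broken link; a UI could style it differently
    source_dir = PurePosixPath(source_path).parent
    # Prefer a note that lives in the same folder as the linking note.
    for path in candidates:
        if PurePosixPath(path).parent == source_dir:
            return path
    return candidates[0]
```

With an index holding both `Getting Started.md` and `guides/Getting Started.md`, a link from `guides/intro.md` resolves to the copy in `guides/`, while a target with no entry returns `None`.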
228
-
229
- **Search Ranking:**
230
- - BM25 algorithm with title-weighted scoring (3x title, 1x body)
231
- - Recency bonus: +1.0 for notes updated in last 7 days, +0.5 for last 30 days
232
- - Returns highlighted snippets with `<mark>` tags
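The recency bonus described above can be expressed as a small function. This is a sketch under stated assumptions: `recency_bonus` and `final_score` are hypothetical names, and it assumes the relevance score has already been converted so that higher is better (SQLite's own `bm25()` function returns lower values for better matches, so real code would need to account for that).

```python
from datetime import datetime, timedelta, timezone


def recency_bonus(updated: datetime, now: "datetime | None" = None) -> float:
    """+1.0 for notes updated in the last 7 days, +0.5 for the last 30 days."""
    now = now or datetime.now(timezone.utc)
    age = now - updated
    if age <= timedelta(days=7):
        return 1.0
    if age <= timedelta(days=30):
        return 0.5
    return 0.0


def final_score(relevance: float, updated: datetime, now: "datetime | None" = None) -> float:
    # Assumption: relevance is a "higher is better" score derived from BM25.
    return relevance + recency_bonus(updated, now)
```

Passing `now` explicitly keeps the function deterministic, which makes the thresholds easy to unit test.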
233
-
234
- **Optimistic Concurrency:**
235
- - Version-based conflict detection for note edits
236
- - Prevents data loss from concurrent edits
237
- - Returns 409 Conflict with helpful message
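Version-based conflict detection of the kind listed above might look like the sketch below. The `Note` dataclass, the `VersionConflict` exception, and the in-memory `store` dictionary are all hypothetical stand-ins; an HTTP layer would presumably translate the exception into a 409 Conflict response.

```python
from dataclasses import dataclass


class VersionConflict(Exception):
    """Raised when the caller edited a stale version of a note."""


@dataclass
class Note:
    body: str
    version: int = 1


def update_note(store: dict, path: str, new_body: str, if_version: int) -> Note:
    """Apply an edit only if the caller saw the latest version."""
    current = store[path]
    if current.version != if_version:
        # Someone else saved in the meantime; reject instead of overwriting.
        raise VersionConflict(
            f"{path}: expected version {if_version}, found {current.version}"
        )
    store[path] = Note(body=new_body, version=current.version + 1)
    return store[path]
```

The client sends the version it loaded along with the edit; a second editor holding the old version gets a conflict rather than silently clobbering the first save.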
238
-
239
- ## 🔒 Authentication
240
-
241
- ### Local Development
242
- Uses a static token: `local-dev-token`
243
-
244
- ### Production (Hugging Face OAuth)
245
- - Multi-tenant with per-user isolated vaults
246
- - JWT tokens with user_id claims
247
- - Automatic vault initialization on first login
248
-
249
- See deployment documentation for HF OAuth setup.
250
-
251
- ## 📊 Performance Considerations
252
-
253
- **SQLite Optimizations:**
254
- - FTS5 with prefix indexes (`prefix='2 3'`) for fast autocomplete and substring matching
255
- - Recommended: Enable WAL mode for concurrent reads/writes:
256
- ```sql
257
- PRAGMA journal_mode=WAL;
258
- PRAGMA synchronous=NORMAL;
259
- ```
260
- - Normalized slug indexes (`normalized_title_slug`, `normalized_path_slug`) for O(1) wikilink resolution
261
- - BM25 ranking weights: 3.0 for title matches, 1.0 for body matches
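The WAL recommendation above can be applied from Python's standard `sqlite3` module when the connection is opened. This is only a sketch: `open_index` is a hypothetical helper, and the project may configure its database differently.

```python
import sqlite3


def open_index(db_path: str):
    """Open the index database and apply the recommended pragmas.

    Returns the connection and the journal mode SQLite actually selected.
    """
    conn = sqlite3.connect(db_path)
    # journal_mode is persistent for the database file; the pragma reports
    # the resulting mode, which is "wal" on success for on-disk databases.
    mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    # synchronous is per-connection and must be set each time.
    conn.execute("PRAGMA synchronous=NORMAL;")
    return conn, mode
```

Checking the returned mode is worthwhile because in-memory databases cannot use WAL and report a different mode.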
262
-
263
- **Rate Limiting:**
264
- - ⚠️ **Production Recommendation**: Add per-user rate limits to prevent abuse
265
- - API endpoints currently have no rate limiting
266
- - Consider implementing:
267
- - `/api/notes` (POST): 100 requests/hour per user
268
- - `/api/index/rebuild` (POST): 10 requests/day per user
269
- - `/api/search`: 1000 requests/hour per user
270
- - Use libraries like `slowapi` or Redis-based rate limiting
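As a rough illustration of the per-user limits suggested above, here is an in-memory sliding-window limiter. It is a sketch, not a replacement for `slowapi` or a Redis-backed limiter: the class name is hypothetical, state is lost on restart, and it is not shared between processes.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per user within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # user_id -> timestamps of recent requests

    def allow(self, user_id: str, now: "float | None" = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

For the `/api/notes` suggestion, an instance would be created as `SlidingWindowLimiter(100, 3600)` and consulted once per POST with the authenticated user's ID.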
271
-
272
- **Scaling:**
273
- - **Single-server**: SQLite handles 100K+ notes efficiently
274
- - **Multi-server**: Migrate to PostgreSQL with `pg_trgm` or `pgvector` for FTS
275
- - **Caching**: Add Redis for:
276
- - Session tokens (reduce DB lookups)
277
- - Frequently accessed notes
278
- - Search result caching (TTL: 5 minutes)
279
- - **CDN**: Serve frontend assets via CDN for global performance
280
-
281
- ## 🧪 Development
282
-
283
- ### Project Structure
284
 
285
- ```
286
- Document-MCP/
287
- ├── backend/
288
- │   ├── src/
289
- │   │   ├── api/ # FastAPI routes & middleware
290
- │   │   ├── mcp/ # FastMCP server
291
- │   │   ├── models/ # Pydantic models
292
- │   │   └── services/ # Business logic
293
- │   └── tests/ # Backend tests
294
- ├── frontend/
295
- │   ├── src/
296
- │   │   ├── components/ # React components
297
- │   │   ├── pages/ # Page components
298
- │   │   ├── services/ # API client
299
- │   │   └── types/ # TypeScript types
300
- │   └── tests/ # Frontend tests
301
- ├── data/
302
- │   ├── vaults/ # User markdown files
303
- │   └── index.db # SQLite database
304
- ├── specs/ # Feature specifications
305
- └── start-dev.sh # Development startup script
306
- ```
307
 
308
- ### Adding a New Note (via UI)
309
 
310
- 1. Click "New Note" button
311
- 2. Enter note name (`.md` extension optional)
312
- 3. Edit in split-pane editor
313
- 4. Save with Cmd/Ctrl+S
314
 
315
- ### Adding a New Note (via MCP)
316
 
317
- ```python
318
- # AI agent writes a note
319
- write_note(
320
- path="guides/my-guide.md",
321
- body="# My Guide\n\nContent here with [[links]]",
322
- title="My Guide",
323
- metadata={"tags": ["guide", "tutorial"]}
324
- )
325
- ```
326
 
327
- ## 🐛 Troubleshooting
328
 
329
- **Backend won't start:**
330
- - Ensure virtual environment is activated
331
- - Check environment variables are set
332
- - Verify database is initialized
333
 
334
- **Frontend shows connection errors:**
335
- - Ensure backend is running on port 8000
336
- - Check Vite proxy configuration in `frontend/vite.config.ts`
337
 
338
- **Search returns no results:**
339
- - Verify notes are indexed (check Settings → Index Health)
340
- - Try rebuilding the index via Settings page
341
 
342
- **MCP tools not showing in Claude:**
343
- - Verify MCP configuration path is correct
344
- - Check `PYTHONPATH` includes project root
345
- - Restart Claude Desktop after config changes
346
 
347
  ## 📝 License
348
 
349
- [Add license information]
350
 
351
  ## 🤝 Contributing
352
 
353
- [Add contributing guidelines]
354
 
355
- ## 📧 Contact
356
 
357
- [Add contact information]
358
 
 
1
+ ---
2
+ title: Document Viewer
3
+ emoji: 📚
4
+ colorFrom: blue
5
+ colorTo: purple
6
+ sdk: docker
7
+ pinned: false
8
+ license: mit
9
+ ---
10
 
11
+ # Document Viewer - AI-Powered Documentation System
12
 
13
+ An Obsidian-style documentation system where AI agents and humans collaborate on creating and maintaining documentation.
14
 
15
+ ## ⚠️ Demo Mode
16
 
17
+ **This is a demonstration instance with ephemeral storage.**
18
 
19
+ - All data is temporary and resets on server restart
20
+ - Demo content is automatically seeded on each startup
21
+ - For production use, deploy your own instance with persistent storage
22
 
23
+ ## 🎯 Features
 
 
 
24
 
25
+ - **Wikilinks** - Link between notes using `[[Note Name]]` syntax
26
+ - **Full-Text Search** - BM25 ranking with recency bonus
27
+ - **Backlinks** - Automatically track note references
28
+ - **Split-Pane Editor** - Live markdown preview
29
+ - **MCP Integration** - AI agents can read/write via Model Context Protocol
30
+ - **Multi-Tenant** - Each user gets an isolated vault (HF OAuth)
31
 
32
+ ## 🚀 Getting Started
33
 
34
+ 1. Click **"Sign in with Hugging Face"** to authenticate
35
+ 2. Browse the pre-seeded demo notes
36
+ 3. Try searching, creating, and editing notes
37
+ 4. Check out the wikilinks between documents
38
 
39
+ ## 🤖 AI Agent Access (MCP)
40
 
41
+ After signing in, go to **Settings** to get your API token for MCP access:
42
 
43
  ```json
44
  {
45
  "mcpServers": {
46
  "obsidian-docs": {
47
+ "url": "https://YOUR_USERNAME-Document-MCP.hf.space/mcp",
48
+ "transport": "http",
49
+ "headers": {
50
+ "Authorization": "Bearer YOUR_JWT_TOKEN"
51
  }
52
  }
53
  }
54
  }
55
  ```
56
 
57
+ For local experiments you can still run the MCP server via STDIO; use the "Local Development" snippet shown in Settings.
58
 
59
+ AI agents can then use these tools:
60
+ - `list_notes` - Browse vault
61
+ - `read_note` - Read note content
62
+ - `write_note` - Create/update notes
63
+ - `search_notes` - Full-text search
64
+ - `get_backlinks` - Find references
65
+ - `get_tags` - List all tags
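To make the tool list above more concrete, the sketch below builds the JSON-RPC 2.0 message that an MCP client sends for a `tools/call` request, plus the headers for the HTTP transport. It only constructs the payload and does not contact any server. The helper names are hypothetical, and the `query` argument for `search_notes` is an assumption, since the README does not document tool parameters. A real client would also perform the MCP `initialize` handshake before calling tools.

```python
import json


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def build_headers(token: str) -> dict:
    """Headers for an HTTP MCP endpoint that expects a bearer token."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # The HTTP transport may answer with plain JSON or an event stream.
        "Accept": "application/json, text/event-stream",
    }


if __name__ == "__main__":
    # "query" is an assumed parameter name for search_notes.
    request = build_tool_call("search_notes", {"query": "wikilinks"})
    print(json.dumps(request, indent=2))
    print(build_headers("YOUR_JWT_TOKEN"))
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for debugging what goes over the wire.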
66
 
67
+ ## 🏗️ Tech Stack
68
 
69
+ **Backend:**
70
+ - FastAPI - HTTP API server
71
+ - FastMCP - MCP server for AI integration
72
+ - SQLite FTS5 - Full-text search
73
+ - python-frontmatter - YAML metadata
74
 
75
+ **Frontend:**
76
+ - React + Vite - Modern web framework
77
+ - shadcn/ui - UI components
78
+ - Tailwind CSS - Styling
79
+ - react-markdown - Markdown rendering
80
 
81
+ ## 📖 Documentation
82
 
83
+ Key demo notes to explore:
 
 
 
84
 
85
+ - **Getting Started** - Introduction and overview
86
+ - **API Documentation** - REST API reference
87
+ - **MCP Integration** - AI agent configuration
88
+ - **Wikilink Examples** - How linking works
89
+ - **Architecture Overview** - System design
90
+ - **Search Features** - Full-text search details
91
 
92
+ ## ⚙️ Deploy Your Own
93
 
94
+ Want persistent storage and full control? Deploy your own instance:
95
 
96
+ 1. Clone the repository
97
+ 2. Set up HF OAuth app
98
+ 3. Configure environment variables
99
+ 4. Deploy to HF Spaces or any Docker host
100
 
101
+ See [DEPLOYMENT.md](https://github.com/YOUR_REPO/Document-MCP/blob/main/DEPLOYMENT.md) for detailed instructions.
 
 
102
 
103
+ ## 🔒 Privacy & Data
 
 
104
 
105
+ - **Multi-tenant**: Each HF user gets an isolated vault
106
+ - **Demo data**: Resets on restart (ephemeral storage)
107
+ - **OAuth**: Secure authentication via Hugging Face
108
+ - **No tracking**: We don't collect analytics or personal data
109
 
110
  ## 📝 License
111
 
112
+ MIT License - See LICENSE file for details
113
 
114
  ## 🤝 Contributing
115
 
116
+ Contributions welcome! Open an issue or submit a PR.
117
 
118
+ ---
119
 
120
+ Built with ❤️ for the AI-human documentation collaboration workflow
121