bigwolfe committed on
Commit
90150ee
·
1 Parent(s): 786c2ca
specs/001-obsidian-docs-viewer/checklists/requirements.md ADDED
@@ -0,0 +1,55 @@
+ # Specification Quality Checklist: Multi-Tenant Obsidian-Like Docs Viewer
+
+ **Purpose**: Validate specification completeness and quality before proceeding to planning
+ **Created**: 2025-11-15
+ **Feature**: [spec.md](../spec.md)
+
+ ## Content Quality
+
+ - [x] No implementation details (languages, frameworks, APIs)
+ - [x] Focused on user value and business needs
+ - [x] Written for non-technical stakeholders
+ - [x] All mandatory sections completed
+
+ ## Requirement Completeness
+
+ - [x] No [NEEDS CLARIFICATION] markers remain
+ - [x] Requirements are testable and unambiguous
+ - [x] Success criteria are measurable
+ - [x] Success criteria are technology-agnostic (no implementation details)
+ - [x] All acceptance scenarios are defined
+ - [x] Edge cases are identified
+ - [x] Scope is clearly bounded
+ - [x] Dependencies and assumptions identified
+
+ ## Feature Readiness
+
+ - [x] All functional requirements have clear acceptance criteria
+ - [x] User scenarios cover primary flows
+ - [x] Feature meets measurable outcomes defined in Success Criteria
+ - [x] No implementation details leak into specification
+
+ ## Validation Notes
+
+ **Validation Date**: 2025-11-15
+
+ ### Content Quality Review
+ ✅ **PASS** - The specification maintains clear separation between WHAT (user needs) and HOW (implementation). While it references specific technologies (FastMCP, React, shadcn/ui, JWT) from the user's input, these are treated as constraints/assumptions rather than design decisions. The core requirements focus on user capabilities (authentication, vault isolation, search, wikilink resolution) rather than technical implementation.
+
+ ### Requirement Completeness Review
+ ✅ **PASS** - All requirements are:
+ - Testable: Each FR and SC can be verified through specific test scenarios
+ - Unambiguous: Clear language with specific behaviors (e.g., "409 Conflict", "case-insensitive normalized slug matching")
+ - Measurable: Success criteria include specific metrics (500ms, 2 seconds, 100% conflict detection)
+ - Technology-agnostic in outcomes: SCs focus on user-facing results (completion time, isolation guarantees)
+ - Comprehensive: 68 functional requirements, 14 success criteria, 10 edge cases, 5 prioritized user stories
+
+ ### Feature Readiness Review
+ ✅ **PASS** - The specification is implementation-ready:
+ - User stories are independently testable with clear acceptance scenarios
+ - P1 stories (AI write, human read) deliver standalone MVP value
+ - Edge cases cover security (path traversal), concurrency (version conflicts), limits (1 MiB, 5000 notes)
+ - Scope boundaries explicitly exclude features (aliases, mobile UI, real-time collab)
+ - Assumptions document deployment context and technical constraints
+
+ **Conclusion**: Specification passes all quality gates. Ready for `/speckit.plan` or `/speckit.clarify`.
specs/001-obsidian-docs-viewer/plan.md ADDED
@@ -0,0 +1,189 @@
+ # Implementation Plan: Multi-Tenant Obsidian-Like Docs Viewer
+
+ **Branch**: `001-obsidian-docs-viewer` | **Date**: 2025-11-15 | **Spec**: [spec.md](./spec.md)
+ **Input**: Feature specification from `/specs/001-obsidian-docs-viewer/spec.md`
+
+ **Note**: This template is filled in by the `/speckit.plan` command. See `.specify/templates/commands/plan.md` for the execution workflow.
+
+ ## Summary
+
+ Build a multi-tenant Obsidian-like documentation viewer with AI-first workflow (AI writes via MCP, humans read/edit via web UI). System provides per-user vaults with Markdown notes, full-text search, wikilink resolution, tag indexing, and backlink tracking. Backend exposes FastMCP server (STDIO + HTTP transports) and HTTP API with Bearer auth (JWT). Frontend is React SPA with shadcn/ui, featuring directory tree navigation and split-pane editor. Deployment targets: local PoC (single-user, STDIO) and Hugging Face Space (multi-tenant, OAuth).
+
+ **Technical Approach**: Python backend with FastAPI + FastMCP, SQLite per-user indices, filesystem-based vault storage, JWT authentication. React + Vite frontend with react-markdown rendering. Incremental index updates on writes, optimistic concurrency for UI, last-write-wins for MCP.
+
+ ## Technical Context
+
+ **Language/Version**: Python 3.11+
+ **Primary Dependencies**: FastAPI, FastMCP, python-frontmatter, PyJWT, huggingface_hub, SQLite (stdlib)
+ **Storage**: Filesystem (per-user vault directories), SQLite (per-user indices)
+ **Testing**: pytest (backend unit/integration), Vitest (frontend unit), Playwright (E2E)
+ **Target Platform**: Linux server (HF Space), local dev (Windows/macOS/Linux)
+ **Project Type**: Web application (Python backend + React frontend)
+ **Performance Goals**:
+ - MCP operations: <500ms (read/write/search) for vaults with 1,000 notes
+ - UI rendering: <2s directory tree load, <1s note render, <1s search results
+ - Index rebuild: <30s for 1,000 notes
+ **Constraints**:
+ - 1 MiB max note size
+ - 5,000 notes max per vault
+ - 256 char max path length
+ - 100% tenant isolation (security requirement)
+ - 409 Conflict on concurrent edits (UI only)
+ **Scale/Scope**:
+ - MVP: 10 concurrent users (HF Space), 5,000 notes per user
+ - Local PoC: single user, unlimited notes (within filesystem limits)
+
+ ## Constitution Check
+
+ *GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
+
+ **Status**: No project constitution file found at `.specify/memory/constitution.md`. Constitution check skipped.
+
+ **Justification**: This is a new project without established architectural principles. Design decisions will be documented in research.md and can form the basis of a future constitution if needed.
+
+ ## Project Structure
+
+ ### Documentation (this feature)
+
+ ```text
+ specs/001-obsidian-docs-viewer/
+ ├── plan.md              # This file (/speckit.plan command output)
+ ├── research.md          # Phase 0 output (/speckit.plan command)
+ ├── data-model.md        # Phase 1 output (/speckit.plan command)
+ ├── quickstart.md        # Phase 1 output (/speckit.plan command)
+ ├── contracts/           # Phase 1 output (/speckit.plan command)
+ │   ├── http-api.yaml    # OpenAPI 3.1 spec for HTTP API
+ │   └── mcp-tools.json   # MCP tool schemas (JSON Schema)
+ └── tasks.md             # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
+ ```
+
+ ### Source Code (repository root)
+
+ ```text
+ # Web application structure (Python backend + React frontend)
+
+ backend/
+ ├── src/
+ │   ├── models/              # Pydantic models (Note, User, Index, Config)
+ │   ├── services/
+ │   │   ├── vault.py         # Filesystem vault operations
+ │   │   ├── indexer.py       # Full-text search, tags, link graph
+ │   │   ├── auth.py          # JWT + HF OAuth integration
+ │   │   └── config.py        # Configuration management
+ │   ├── api/
+ │   │   ├── main.py          # FastAPI app + middleware
+ │   │   ├── routes/          # API endpoints (notes, search, auth, index)
+ │   │   └── middleware/      # Auth middleware, error handlers
+ │   └── mcp/
+ │       └── server.py        # FastMCP server (STDIO + HTTP)
+ ├── tests/
+ │   ├── unit/                # Service-level tests
+ │   ├── integration/         # API + MCP integration tests
+ │   └── contract/            # Contract tests for MCP tools + HTTP API
+ └── pyproject.toml           # Python dependencies (Poetry/pip)
+
+ frontend/
+ ├── src/
+ │   ├── components/
+ │   │   ├── ui/              # shadcn/ui components
+ │   │   ├── DirectoryTree.tsx
+ │   │   ├── NoteViewer.tsx
+ │   │   ├── NoteEditor.tsx
+ │   │   ├── SearchBar.tsx
+ │   │   └── AuthFlow.tsx
+ │   ├── pages/
+ │   │   ├── App.tsx          # Main app layout
+ │   │   ├── Login.tsx        # HF OAuth landing
+ │   │   └── Settings.tsx     # User profile + token management
+ │   ├── services/
+ │   │   ├── api.ts           # HTTP API client (fetch wrapper)
+ │   │   └── auth.ts          # Token management, OAuth helpers
+ │   ├── lib/
+ │   │   ├── wikilink.ts      # Wikilink parsing + resolution
+ │   │   └── markdown.ts      # react-markdown config
+ │   └── types/               # TypeScript types (Note, User, SearchResult)
+ ├── tests/
+ │   ├── unit/                # Component tests (Vitest + Testing Library)
+ │   └── e2e/                 # Playwright E2E tests
+ ├── package.json             # Node dependencies
+ └── vite.config.ts           # Vite build config
+
+ data/                        # Runtime data (gitignored)
+ └── vaults/
+     └── <user_id>/           # Per-user vault directories
+
+ .env.example                 # Environment template (JWT_SECRET, HF OAuth, etc.)
+ README.md                    # Setup instructions, MCP client config examples
+ ```
+
+ **Structure Decision**: Web application structure selected based on spec requirements for Python backend (FastAPI + FastMCP) and React frontend (shadcn/ui). Backend and frontend are separate codebases to support independent development/testing cycles, with backend serving frontend as static files in production (HF Space). Data directory is runtime-only (vaults + SQLite indices), not version controlled.
+
+ ## Complexity Tracking
+
+ > **No constitution violations to track** (no project constitution exists yet)
+
+ ## Phase 0: Research & Technical Decisions
+
+ **Status**: Research required for technology integration patterns and best practices.
+
+ **Research Topics**:
+ 1. FastMCP HTTP transport authentication patterns (Bearer token validation)
+ 2. Hugging Face Space OAuth integration best practices (attach/parse helpers)
+ 3. SQLite schema design for per-user multi-index storage (full-text + tags + links)
+ 4. Wikilink normalization and resolution algorithms (slug matching, ambiguity handling)
+ 5. React + shadcn/ui directory tree component patterns (collapsible, virtualization)
+ 6. Optimistic concurrency implementation patterns (ETags vs version counters)
+ 7. Markdown frontmatter parsing with fallback strategies (malformed YAML handling)
+ 8. JWT token management in React (localStorage vs memory, refresh strategies)
+
+ **Output**: See `research.md` for detailed findings and decisions.
+
+ ## Phase 1: Data Model & Contracts
+
+ **Prerequisites**: `research.md` complete
+
+ ### Data Model
+
+ **Entities** (see `data-model.md` for full schemas):
+
+ 1. **User**: `user_id`, `hf_profile` (optional), `vault_path`, `created_at`
+ 2. **Note**: `path`, `title`, `metadata`, `body`, `version`, `created`, `updated`
+ 3. **Wikilink**: `source_path`, `link_text`, `target_path` (nullable), `is_resolved`
+ 4. **Tag**: `tag_name`, `note_paths[]`
+ 5. **Index**: `user_id`, `note_count`, `last_full_rebuild`, `last_incremental_update`
+ 6. **Token**: `jwt` (claims: `sub`, `exp`, `iat`)
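
The entity list above could map onto SQLite tables roughly as follows. This is only a sketch under stated assumptions: the table names (`note_metadata`, `note_tags`, `note_links`), the column types, and the helpers `open_index` and `backlinks` are hypothetical and are not taken from `data-model.md`. It shows how keying every row by `user_id` keeps one user's backlinks out of another user's results.

```python
import sqlite3

# Hypothetical schema: table and column names follow the entity list in the
# plan, but the concrete layout is an assumption, not the project's schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS note_metadata (
    user_id TEXT NOT NULL,
    path    TEXT NOT NULL,
    title   TEXT NOT NULL,
    version INTEGER NOT NULL DEFAULT 1,
    created TEXT NOT NULL,
    updated TEXT NOT NULL,
    PRIMARY KEY (user_id, path)
);
CREATE TABLE IF NOT EXISTS note_tags (
    user_id   TEXT NOT NULL,
    tag_name  TEXT NOT NULL,
    note_path TEXT NOT NULL,
    PRIMARY KEY (user_id, tag_name, note_path)
);
CREATE TABLE IF NOT EXISTS note_links (
    user_id     TEXT NOT NULL,
    source_path TEXT NOT NULL,
    link_text   TEXT NOT NULL,
    target_path TEXT,                       -- NULL while the link is unresolved
    is_resolved INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id, source_path, link_text)
);
"""


def open_index(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) an index database and make sure the tables exist."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def backlinks(conn: sqlite3.Connection, user_id: str, target_path: str) -> list[str]:
    """Return the paths of this user's notes that link to target_path."""
    rows = conn.execute(
        "SELECT source_path FROM note_links "
        "WHERE user_id = ? AND target_path = ? AND is_resolved = 1 "
        "ORDER BY source_path",
        (user_id, target_path),
    )
    return [row[0] for row in rows]
```

Scoping every query by `user_id` is one way to get isolation inside a shared database; a separate database file per user would be another, and the plan's "per-user indices" wording leaves both open.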
+
+ ### API Contracts
+
+ **HTTP API** (see `contracts/http-api.yaml`):
+ - Authentication: `POST /api/tokens`, `GET /api/me`
+ - Notes CRUD: `GET /api/notes`, `GET /api/notes/{path}`, `PUT /api/notes/{path}`, `DELETE /api/notes/{path}`
+ - Search: `GET /api/search?q=<query>`
+ - Navigation: `GET /api/backlinks/{path}`, `GET /api/tags`
+ - Index: `GET /api/index/health`, `POST /api/index/rebuild`
+
+ **MCP Tools** (see `contracts/mcp-tools.json`):
+ - `list_notes`: `{folder?: string}` → `[{path, title, last_modified}]`
+ - `read_note`: `{path: string}` → `{path, title, metadata, body}`
+ - `write_note`: `{path, title?, metadata?, body}` → `{status, path}`
+ - `delete_note`: `{path: string}` → `{status}`
+ - `search_notes`: `{query: string}` → `[{path, title, snippet}]`
+ - `get_backlinks`: `{path: string}` → `[{path, title}]`
+ - `get_tags`: `{}` → `[{tag, count}]`
+
+ ### Quickstart
+
+ **Output**: See `quickstart.md` for:
+ - Local development setup (Python venv, Node install, env config)
+ - Running backend (STDIO MCP + HTTP API)
+ - Running frontend (Vite dev server)
+ - MCP client configuration (Claude Code STDIO example)
+ - Testing workflows (unit, integration, E2E)
+
+ ## Phase 2: Task Generation
+
+ **Not included in this command** - run `/speckit.tasks` to generate dependency-ordered implementation tasks based on this plan and the data model.
+
+ ---
+
+ **Plan Status**: Phase 0 and Phase 1 execution in progress below...
specs/001-obsidian-docs-viewer/spec.md ADDED
@@ -0,0 +1,282 @@
+ # Feature Specification: Multi-Tenant Obsidian-Like Docs Viewer
+
+ **Feature Branch**: `001-obsidian-docs-viewer`
+ **Created**: 2025-11-15
+ **Status**: Draft
+ **Input**: User description: "Build a multi-tenant Obsidian-like docs viewer with FastMCP server (Python) exposing tools over MCP, HTTP API with Bearer auth for UI and MCP, multi-tenant vaults (per-user Markdown directories), indexing (full-text search, backlinks, tags), and a React + shadcn/ui frontend hosted in a Hugging Face Space with Obsidian-style UI (left: directory pane with vault explorer + search, right: main note pane with live-rendered Markdown and light editing). Primary workflow: AI (via MCP) writes/updates docs, humans read + occasionally tweak in the UI."
+
+ ## User Scenarios & Testing *(mandatory)*
+
+ ### User Story 1 - AI Agent Writes and Updates Documentation (Priority: P1)
+
+ An AI agent (Claude via MCP) needs to create and maintain structured documentation within a user's vault. The agent discovers existing notes, creates new notes with proper frontmatter and wikilinks, updates existing content, and automatically maintains the index for searchability.
+
+ **Why this priority**: This is the primary workflow. Without AI write capability via MCP, the core value proposition doesn't exist. This enables the "AI writes, humans read" paradigm.
+
+ **Independent Test**: Can be fully tested by configuring an MCP client (Claude Code/Desktop) with STDIO transport, issuing `write_note` commands, and verifying files are created with correct frontmatter, body content, and automatic index updates. Delivers immediate value for AI-driven documentation workflows without requiring the UI.
+
+ **Acceptance Scenarios**:
+
+ 1. **Given** an MCP client connected via STDIO transport, **When** the agent calls `write_note` with path "api/design.md", title "API Design", and markdown body, **Then** the file is created with YAML frontmatter containing title, created/updated timestamps, and the body content
+ 2. **Given** a note already exists at "api/design.md" with version 3, **When** the agent calls `write_note` with updated content, **Then** the note is updated, version increments to 4, updated timestamp is set to now, and the full-text index is updated
+ 3. **Given** a note containing wikilinks `[[Authentication Flow]]` and `[[Database Schema]]`, **When** the note is written, **Then** the link graph index records outgoing links and updates backlinks for target notes
+ 4. **Given** a note with frontmatter tags `[backend, api]`, **When** the note is written, **Then** the tag index is updated to include this note under both tags
+ 5. **Given** an MCP client requests `search_notes` with query "authentication", **When** notes containing "authentication" in title or body exist, **Then** results are returned ranked by title matches first, then body matches, with recency bonus
+
+ ---
+
+ ### User Story 2 - Human Reads Documentation in Web UI (Priority: P1)
+
+ A human user logs into the web UI, browses their vault's directory tree, searches for notes, and reads rendered Markdown with working wikilinks and backlinks visible.
+
+ **Why this priority**: The read-first UI is the primary human interaction mode. Without this, humans can't consume the AI-generated documentation effectively. This is as critical as the AI write capability, so it shares the P1 priority.
+
+ **Independent Test**: Can be fully tested by opening the web UI, authenticating with a static token (local mode), clicking through directory tree items, and verifying rendered Markdown displays correctly with clickable wikilinks. Delivers value for documentation consumption without requiring MCP or editing features.
+
+ **Acceptance Scenarios**:
+
+ 1. **Given** a user is on the login page in local mode, **When** they access the UI with a valid static Bearer token in local storage, **Then** they see the directory pane with all vault notes organized by folder structure
+ 2. **Given** the directory pane shows nested folders, **When** the user clicks on a note "api/design.md", **Then** the main pane displays the rendered Markdown with title "API Design", body content with proper formatting, and metadata footer showing tags and timestamps
+ 3. **Given** a rendered note contains a wikilink `[[Authentication Flow]]`, **When** the user clicks the link, **Then** the UI navigates to and renders "authentication-flow.md" (resolved via normalized slug matching)
+ 4. **Given** a note "auth.md" is referenced by 3 other notes, **When** the note is displayed, **Then** the backlinks section in the footer shows all 3 referring notes as clickable links
+ 5. **Given** the user types "database" into the search bar, **When** the search debounces and executes, **Then** matching notes appear in a dropdown with snippets, ranked by title matches (3x weight) then body matches, with recency bonus
+
+ ---
+
+ ### User Story 3 - Human Edits Documentation in Web UI (Priority: P2)
+
+ A human user needs to make minor corrections or additions to AI-generated documentation. They click "Edit" on a note, see a split view with Markdown editor on the left and live preview on the right, make changes, and save with optimistic concurrency protection.
+
+ **Why this priority**: Enables human refinement of AI content. Important for quality but secondary to read/write workflows. Users can still achieve primary value (AI writes, humans read) without editing.
+
+ **Independent Test**: Can be tested by opening a note, clicking "Edit", modifying content in the textarea, clicking "Save", and verifying the file is updated with version conflict detection. Delivers value for collaborative human-AI documentation without requiring full MCP or advanced features.
+
+ **Acceptance Scenarios**:
+
+ 1. **Given** a user is viewing a rendered note with version 5, **When** they click the "Edit" button, **Then** the main pane switches to split view: left side shows markdown source in a textarea, right side shows live-rendered preview
+ 2. **Given** the user is in edit mode, **When** they modify the markdown body and click "Save", **Then** a PUT request is sent with `if_version: 5`, the note is updated to version 6, updated timestamp is set to now, and the UI switches back to read mode
+ 3. **Given** the user opened a note with version 5 and another user/agent updated it to version 6, **When** the first user clicks "Save", **Then** the server returns 409 Conflict and the UI displays "This note changed since you opened it; please reload before saving"
+ 4. **Given** the user edits the note title in frontmatter from "API Design" to "API Architecture", **When** they save, **Then** the file is updated with new title, directory tree reflects the change, and the title-based link resolution index is updated
+ 5. **Given** the user adds a new wikilink `[[New Feature]]` that doesn't exist, **When** they save and view the rendered note, **Then** the wikilink is rendered as a "broken link" style with a "Create note" affordance
+
+ ---
+
+ ### User Story 4 - Multi-Tenant Access via Hugging Face OAuth (Priority: P2)
+
+ Multiple users can sign in to the Hugging Face Space using "Sign in with HF", each getting isolated vaults, personalized API tokens for MCP access, and per-user indices.
+
+ **Why this priority**: Enables the production deployment model. Critical for multi-user scenarios but not needed for local PoC or single-user hackathon demos.
+
+ **Independent Test**: Can be tested by deploying to HF Space with OAuth enabled, signing in with two different HF accounts, creating notes in each vault, and verifying complete data isolation. Delivers value for hosted multi-tenant scenarios without requiring advanced features.
+
+ **Acceptance Scenarios**:
+
+ 1. **Given** a user visits the HF Space and is not authenticated, **When** they land on the app, **Then** they see a "Sign in with Hugging Face" button
+ 2. **Given** a user clicks "Sign in with Hugging Face", **When** HF OAuth flow completes successfully, **Then** the backend maps their HF username to an internal user_id, creates a vault directory at `/data/vaults/<user_id>/`, initializes an empty index, and redirects to the main app UI
+ 3. **Given** a user is authenticated via HF OAuth, **When** they call `POST /api/tokens`, **Then** the server issues a JWT with `sub=user_id` and `exp=now+90days`, returning `{"token": "<jwt>"}`
+ 4. **Given** an MCP client configures the HTTP transport with `Authorization: Bearer <jwt>`, **When** the client calls `list_notes`, **Then** the server validates the JWT, extracts user_id, and returns notes only from that user's vault
+ 5. **Given** two users (Alice and Bob) each create notes in their vaults, **When** Alice searches for notes, **Then** she sees only her own notes, never Bob's (complete data isolation)
+
+ ---
+
+ ### User Story 5 - Full-Text Search with Index Health Monitoring (Priority: P3)
+
+ Users and AI agents can search across all notes using full-text queries, with results ranked by relevance (title matches weighted higher, recency bonus). Users can manually trigger index rebuilds if needed and view index health status.
+
+ **Why this priority**: Enhances discoverability but is a supporting feature. The basic search in P1/P2 stories is sufficient for MVP. Index rebuild is primarily a maintenance/troubleshooting tool.
+
+ **Independent Test**: Can be tested by creating notes with specific keywords, calling `search_notes` MCP tool or `/api/search` endpoint, and verifying ranking (title 3x weight, body 1x, recency bonus). Can verify rebuild by calling `POST /api/index/rebuild` and checking updated counts. Delivers value for large vaults without requiring other features.
+
+ **Acceptance Scenarios**:
+
+ 1. **Given** a vault with 50 notes, 10 containing "authentication" in body, and 2 containing "authentication" in title, **When** a search query "authentication" is executed, **Then** the 2 title-match notes are ranked first, followed by body-match notes, with notes updated in last 7 days receiving a +1.0 recency bonus
+ 2. **Given** a note contains tokens in both title and body matching the query, **When** the search is executed, **Then** the score is `(3 * title_hits) + (1 * body_hits) + recency_bonus` and results are sorted by descending score
+ 3. **Given** a user calls `GET /api/index/health`, **When** the index exists, **Then** the response includes `note_count`, `last_full_rebuild` timestamp, and `last_incremental_update` timestamp
+ 4. **Given** a user has made many manual file changes outside the app, **When** they call `POST /api/index/rebuild`, **Then** the server drops existing index rows for their user_id, re-scans all .md files, rebuilds full-text index, tag index, and link graph, and updates `last_full_rebuild` timestamp
+ 5. **Given** the index shows `note_count: 100` and `last_incremental_update` is 2 minutes ago, **When** a new note is written via MCP, **Then** `note_count` increments to 101, `last_incremental_update` is set to now, and the new note is immediately searchable
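
The ranking rule in scenarios 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not the project's indexer: the function names are hypothetical, a "hit" is assumed to be one occurrence of a query token, tokenization is assumed to be ASCII-only, and the 0.5 bonus tier for updates within 30 days comes from FR-013 in this same spec.

```python
import re
from datetime import datetime, timedelta, timezone

TOKEN_RE = re.compile(r"[A-Za-z0-9]+")  # assumption: ASCII alphanumerics only


def tokenize(text: str) -> list[str]:
    """Split on non-alphanumeric characters and lowercase the pieces."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def recency_bonus(updated: datetime, now: datetime) -> float:
    """1.0 for updates in the last 7 days, 0.5 for the last 30, else 0."""
    age = now - updated
    if age <= timedelta(days=7):
        return 1.0
    if age <= timedelta(days=30):
        return 0.5
    return 0.0


def score_note(query: str, title: str, body: str,
               updated: datetime, now: datetime) -> float:
    """(3 * title_hits) + (1 * body_hits) + recency_bonus, or 0.0 if no hits."""
    terms = set(tokenize(query))
    title_hits = sum(1 for token in tokenize(title) if token in terms)
    body_hits = sum(1 for token in tokenize(body) if token in terms)
    if title_hits == 0 and body_hits == 0:
        return 0.0  # non-matching notes (and empty queries) get no score
    return 3 * title_hits + 1 * body_hits + recency_bonus(updated, now)


now = datetime(2025, 11, 15, tzinfo=timezone.utc)
fresh = score_note("authentication", "Authentication Flow",
                   "authentication uses tokens", now - timedelta(days=1), now)
older = score_note("authentication", "Notes",
                   "authentication", now - timedelta(days=20), now)
```

Here `fresh` scores 3 + 1 + 1.0 and `older` scores 0 + 1 + 0.5, so sorting by descending score puts the title match first.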
+
+ ---
+
+ ### Edge Cases
+
+ - **Wikilink ambiguity**: When `[[Note Name]]` matches multiple files (e.g., `docs/setup.md` and `guides/setup.md`), the system resolves deterministically by preferring same-folder match first, then lexicographically smallest path. Ambiguous links may be flagged in backlinks view.
+ - **Concurrent edits**: When Claude (via MCP, last-write-wins) and a human (via UI, optimistic concurrency) edit the same note simultaneously, the human's save will fail with 409 Conflict if the version changed, preventing silent data loss from the human perspective.
+ - **Broken wikilinks**: When a note contains `[[Non Existent Note]]`, it is rendered as a visually distinct "broken link" style. The index tracks unresolved links. UI offers "Create note" affordance on click.
+ - **Large note uploads**: When a note exceeds 1 MiB UTF-8 text, the server returns `413 Payload Too Large` with a clear error message.
+ - **Vault limit exceeded**: When a user attempts to create a note that would exceed 5,000 notes in their vault, the server returns `403 Forbidden` with error code "vault_note_limit_exceeded".
+ - **Malformed frontmatter**: When a note has invalid YAML frontmatter, the system treats it as a note without frontmatter, using the first `# Heading` as title or filename stem as fallback.
+ - **Path traversal attempts**: When a path contains `..` or absolute path components, the vault module normalizes and validates against the user's vault root, rejecting any escape attempts with `400 Bad Request`.
+ - **Token expiration**: When a JWT expires (after 90 days), API/MCP requests return `401 Unauthorized`. User must re-authenticate and issue a new token via `POST /api/tokens`.
+ - **Case-insensitive wikilink resolution**: When notes titled "api design" and "API Design" both exist, a wikilink such as `[[api design]]` or `[[API Design]]` resolves via case-insensitive normalized slug matching, with an exact-case match preferred over any other case variation.
+ - **Empty search query**: When search is called with an empty or whitespace-only query, the API returns an empty result set without error.
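
The title fallback in the malformed-frontmatter case can be illustrated with a small helper. This is a sketch, assuming the frontmatter parse has already failed or found no title; `fallback_title` is a hypothetical name, and treating only level-1 headings (`# `) as candidates is an assumption drawn from the "first `# Heading`" wording.

```python
import re
from pathlib import PurePosixPath

# A level-1 Markdown heading: a single '#', whitespace, then the heading text.
HEADING_RE = re.compile(r"^#[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def fallback_title(body: str, path: str) -> str:
    """Use the first '# Heading' as title, else the filename stem."""
    match = HEADING_RE.search(body)
    if match:
        return match.group(1)
    return PurePosixPath(path).stem
```

For a body without any level-1 heading, `fallback_title(body, "api/design.md")` falls through to the stem `design`.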
+
+ ## Requirements *(mandatory)*
+
+ ### Functional Requirements
+
+ #### Core Vault Operations
+
+ - **FR-001**: System MUST provide isolated vault directories per user under a configurable base path (e.g., `/data/vaults/<user_id>/`)
+ - **FR-002**: System MUST support arbitrary nested folder structures within each vault, containing Markdown (.md) files
+ - **FR-003**: System MUST enforce path normalization and validation to prevent directory traversal attacks (no `..` escapes, all paths relative to vault root)
+ - **FR-004**: System MUST parse Markdown files with optional YAML frontmatter containing metadata fields (title, tags, created, updated, project, etc.)
+ - **FR-005**: System MUST use `python-frontmatter` library (or equivalent) to load and serialize frontmatter + body
+ - **FR-006**: System MUST auto-manage `created` timestamp (set once on creation if not provided) and `updated` timestamp (always set to now on writes)
+ - **FR-007**: System MUST reject notes exceeding 1 MiB UTF-8 text with `413 Payload Too Large`
+ - **FR-008**: System MUST reject vault operations that would exceed 5,000 notes per user with `403 Forbidden` and error code "vault_note_limit_exceeded"
+ - **FR-009**: System MUST limit relative path strings to 256 characters maximum
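
FR-003 and FR-009 together suggest a single validation step before any filesystem access. The following is a minimal sketch of one way to do it, assuming a POSIX-style host: `resolve_note_path` and `InvalidNotePath` are hypothetical names, and mapping the exception to `400 Bad Request` would be up to the calling route.

```python
from pathlib import Path

MAX_PATH_CHARS = 256  # FR-009 limit on the relative path string


class InvalidNotePath(ValueError):
    """Hypothetical error that an API layer could translate to 400 Bad Request."""


def resolve_note_path(vault_root: Path, relative_path: str) -> Path:
    """Normalize relative_path and ensure it stays inside vault_root."""
    if not relative_path or len(relative_path) > MAX_PATH_CHARS:
        raise InvalidNotePath("path is empty or longer than 256 characters")
    root = vault_root.resolve()
    # Joining with an absolute path discards the root, and resolve() collapses
    # '..' segments, so both kinds of escape fail the containment check below.
    candidate = (root / relative_path).resolve()
    if candidate == root or not candidate.is_relative_to(root):
        raise InvalidNotePath("path escapes the vault root")
    return candidate
```

Resolving first and checking containment afterwards accepts harmless inputs such as `a/../b.md` while still rejecting `../other/secret.md`. A stricter policy that refuses any `..` segment outright would also satisfy FR-003.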
+
+ #### Indexing and Search
+
+ - **FR-010**: System MUST maintain per-user indices for: (a) full-text search (token → note paths), (b) tag index (tag → note paths), (c) link graph (note → outgoing wikilinks, note → backlinks)
+ - **FR-011**: System MUST store indices in SQLite database with per-user isolation
+ - **FR-012**: System MUST support full-text search with simple tokenization (split on non-alphanumeric, case-insensitive)
+ - **FR-013**: System MUST rank search results using scoring formula: `(3 * title_hits) + (1 * body_hits) + recency_bonus`, where recency_bonus is 1.0 for updates in last 7 days, 0.5 for last 30 days, 0 otherwise
+ - **FR-014**: System MUST extract wikilinks from note bodies using regex pattern `\[\[([^\]]+)\]\]`
+ - **FR-015**: System MUST resolve wikilinks via case-insensitive normalized slug matching: normalize(link_text) matches normalize(filename_stem) or normalize(frontmatter_title)
+ - **FR-016**: System MUST handle ambiguous wikilinks deterministically: prefer same-folder match, then lexicographically smallest full path
+ - **FR-017**: System MUST track unresolved wikilinks (links with no matching note) in the index for UI display
+ - **FR-018**: System MUST update indices incrementally on every write/delete operation (synchronous, blocking)
+ - **FR-019**: System MUST provide manual full index rebuild capability that re-scans all vault files and reconstructs indices from scratch
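
FR-014 through FR-017 describe a small pipeline: extract link text, normalize it, match it against note stems and titles, then break ties. A sketch under stated assumptions follows. The regex is the one given in FR-014; the slug rule in `normalize_slug` (lowercase, runs of non-alphanumerics become `-`) is an assumption inferred from the `[[Authentication Flow]]` to `authentication-flow.md` example, and the function names and the `notes` mapping are hypothetical. The exact-case preference from the Edge Cases list is left out for brevity.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

WIKILINK_RE = re.compile(r"\[\[([^\]]+)\]\]")  # pattern from FR-014


def extract_wikilinks(body: str) -> list[str]:
    """Return the link text of every [[wikilink]] in body, in order."""
    return [text.strip() for text in WIKILINK_RE.findall(body)]


def normalize_slug(text: str) -> str:
    """Assumed slug rule: lowercase, non-alphanumeric runs collapse to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def resolve_wikilink(link_text: str, source_path: str,
                     notes: dict[str, str]) -> Optional[str]:
    """Resolve link_text to a note path, or None if the link is unresolved.

    notes maps each note path to its frontmatter title ("" if it has none).
    """
    target = normalize_slug(link_text)
    if not target:
        return None
    candidates = [
        path for path, title in notes.items()
        if target in (normalize_slug(PurePosixPath(path).stem), normalize_slug(title))
    ]
    if not candidates:
        return None  # FR-017: caller records this as an unresolved link
    source_dir = PurePosixPath(source_path).parent
    same_folder = [p for p in candidates if PurePosixPath(p).parent == source_dir]
    # FR-016: same-folder match first, then lexicographically smallest path.
    return min(same_folder or candidates)


notes = {
    "docs/setup.md": "Setup",
    "guides/setup.md": "Setup",
    "authentication-flow.md": "Authentication Flow",
}
```

With this mapping, a `[[Setup]]` link written in `guides/intro.md` resolves to `guides/setup.md`, while the same link in `api/design.md` resolves to `docs/setup.md`.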
+
+ #### Versioning and Concurrency
+
+ - **FR-020**: System MUST maintain a version counter (integer) per note, stored in the index (not in frontmatter)
+ - **FR-021**: System MUST increment version by 1 on every successful write operation
+ - **FR-022**: System MUST support optimistic concurrency for HTTP API writes: if `if_version` parameter is provided and does not match current version, return `409 Conflict`
+ - **FR-023**: System MUST implement last-write-wins for MCP tool writes (no version checking)
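
The two write paths differ only in whether a version is checked. The sketch below keeps versions in an in-memory dict purely for illustration (the spec stores them in the index). `VersionStore` and `VersionConflict` are hypothetical names, and starting a new note at version 1 is an assumption the spec does not state.

```python
from dataclasses import dataclass, field
from typing import Optional


class VersionConflict(Exception):
    """Hypothetical error that an HTTP route could translate to 409 Conflict."""


@dataclass
class VersionStore:
    versions: dict[str, int] = field(default_factory=dict)

    def write(self, path: str, if_version: Optional[int] = None) -> int:
        """Record a write and return the new version.

        if_version=None models an MCP write (last-write-wins); an integer
        models an HTTP write that must match the current version.
        """
        current = self.versions.get(path, 0)
        if if_version is not None and if_version != current:
            raise VersionConflict(f"{path}: expected {if_version}, found {current}")
        self.versions[path] = current + 1
        return current + 1


store = VersionStore()
store.write("api/design.md")                # MCP creates the note: version 1
store.write("api/design.md", if_version=1)  # UI save with matching version: 2
store.write("api/design.md")                # MCP overwrite, no check: 3
```

A UI client that still holds version 2 at this point would get `VersionConflict` on save, which corresponds to the "please reload before saving" scenario in User Story 3.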
+
+ #### Authentication and Authorization
+
+ - **FR-024**: System MUST support two authentication modes: (a) Local mode with static user_id "local-dev" and optional static Bearer token, (b) HF Space mode with OAuth-based per-user identity
+ - **FR-025**: System MUST use JWT tokens for API and MCP HTTP authentication, containing claims: `sub=user_id`, `exp=now+90days`, signed with configurable secret
+ - **FR-026**: System MUST validate Bearer tokens via `Authorization: Bearer <token>` header on all protected endpoints
+ - **FR-027**: System MUST extract user_id from validated JWT and scope all vault/index operations to that user
+ - **FR-028**: System MUST integrate with Hugging Face OAuth in Space mode, using `huggingface_hub.attach_huggingface_oauth` and `parse_huggingface_oauth` helpers
+ - **FR-029**: System MUST map HF OAuth identity (username or ID) to internal user_id
+ - **FR-030**: System MUST create vault directory and initialize empty index on first login for new HF users
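
The token lifecycle in FR-025 through FR-027 (issue with `sub` and a 90-day `exp`, then validate and extract the user) can be sketched with the standard library alone. The plan lists PyJWT as the dependency, so real code would presumably use that. This hand-rolled HS256 version exists only to keep the example self-contained, and the function names, the `InvalidToken` class, and the added `iat` claim are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


class InvalidToken(Exception):
    """Hypothetical error that an API layer could translate to 401 Unauthorized."""


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(user_id: str, secret: str, now: Optional[float] = None,
                lifetime_days: int = 90) -> str:
    """Build an HS256-signed JWT with sub, iat, and exp = iat + 90 days."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": user_id, "iat": issued, "exp": issued + lifetime_days * 86400}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def validate_token(token: str, secret: str, now: Optional[float] = None) -> str:
    """Check signature and expiry, then return the user_id from the sub claim."""
    parts = token.split(".")
    if len(parts) != 3:
        raise InvalidToken("malformed token")
    header_b64, claims_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{claims_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise InvalidToken("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise InvalidToken("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_b64))
    if (time.time() if now is None else now) >= claims["exp"]:
        raise InvalidToken("token expired")
    return claims["sub"]
```

The returned `sub` value is what FR-027 would use to scope every vault and index operation, so a request never names its own vault directly.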
158
+
159
+ #### HTTP API
160
+
161
+ - **FR-031**: System MUST expose HTTP API using FastAPI (or equivalent) with JSON request/response format
162
+ - **FR-032**: System MUST provide endpoint `GET /api/me` returning user info (`user_id`, HF profile if applicable, authentication status)
163
+ - **FR-033**: System MUST provide endpoint `POST /api/tokens` to issue new JWT tokens for authenticated users
164
+ - **FR-034**: System MUST provide endpoint `GET /api/notes` to list notes with optional folder filtering, returning array of `{path, title, last_modified}`
165
+ - **FR-035**: System MUST provide endpoint `GET /api/notes/{path}` (where path is URL-encoded, includes `.md`) returning full note: `{path, title, metadata, body, version, created, updated}`
166
+ - **FR-036**: System MUST provide endpoint `PUT /api/notes/{path}` accepting `{title, metadata, body, if_version?}` to create/update notes
167
+ - **FR-037**: System MUST provide endpoint `DELETE /api/notes/{path}` to delete notes
168
+ - **FR-038**: System MUST provide endpoint `GET /api/search?q=<query>` returning ranked search results with snippets
169
+ - **FR-039**: System MUST provide endpoint `GET /api/backlinks/{path}` returning array of notes that reference the target note
170
+ - **FR-040**: System MUST provide endpoint `GET /api/tags` returning list of `{tag, count}` across all user notes
171
+ - **FR-041**: System MUST provide endpoint `GET /api/index/health` returning `{note_count, last_full_rebuild, last_incremental_update}`
172
+ - **FR-042**: System MUST provide endpoint `POST /api/index/rebuild` to trigger manual full index rebuild
173
+
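As an illustration of FR-035 and FR-036 from the client side, the sketch below builds (without sending) a `PUT /api/notes/{path}` request. The helper names and the `http://localhost:8000` base URL in the usage note are hypothetical. Encoding `/` as `%2F` so the note path occupies a single URL segment is also an assumption; the requirements only say the path is URL-encoded.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def note_url(base_url: str, note_path: str) -> str:
    # safe="" percent-encodes "/" as well, keeping a nested path in one segment.
    return f"{base_url.rstrip('/')}/api/notes/{quote(note_path, safe='')}"


def build_put_note(
    base_url: str,
    token: str,
    note_path: str,
    title: str,
    body: str,
    metadata: Optional[dict] = None,
    if_version: Optional[int] = None,
) -> Request:
    """Prepare a PUT request with the {title, metadata, body, if_version?} payload."""
    payload = {"title": title, "metadata": metadata or {}, "body": body}
    if if_version is not None:
        # Optional: enables the optimistic concurrency check (FR-022).
        payload["if_version"] = if_version
    return Request(
        note_url(base_url, note_path),
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

For example, `build_put_note("http://localhost:8000", token, "projects/My Note.md", "My Note", "text", if_version=3)` targets `/api/notes/projects%2FMy%20Note.md`; passing the result to `urllib.request.urlopen` would send it.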
174
+ #### MCP Server (FastMCP)
175
+
176
+ - **FR-043**: System MUST expose MCP server using FastMCP library with two transport modes: (a) STDIO for local development, (b) HTTP for remote/HF Space access
177
+ - **FR-044**: System MUST configure MCP HTTP transport to require `Authorization: Bearer <token>` header and validate JWT
178
+ - **FR-045**: System MUST provide MCP tool `list_notes` with input `{folder?: string}` returning `[{path, title, last_modified}]`
179
+ - **FR-046**: System MUST provide MCP tool `read_note` with input `{path: string}` returning `{path, title, metadata, body}`
180
+ - **FR-047**: System MUST provide MCP tool `write_note` with input `{path: string, title?: string, metadata?: object, body: string}` returning `{status: "ok", path}`
181
+ - **FR-048**: System MUST provide MCP tool `delete_note` with input `{path: string}` returning `{status: "ok"}`
182
+ - **FR-049**: System MUST provide MCP tool `search_notes` with input `{query: string}` returning `[{path, title, snippet}]`
183
+ - **FR-050**: System MUST provide MCP tool `get_backlinks` with input `{path: string}` returning `[{path, title}]`
184
+ - **FR-051**: System MUST provide MCP tool `get_tags` with input `{}` returning `[{tag, count}]`
185
+ - **FR-052**: System MUST define all MCP tool inputs/outputs with JSON Schema using FastMCP/Pydantic models
186
+
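FR-052 asks for JSON Schema definitions of every tool's inputs and outputs. The requirements expect FastMCP/Pydantic models to supply these; the hand-written dictionary below only illustrates what the `write_note` input from FR-047 might look like as a schema, together with a deliberately tiny, hypothetical checker (`validation_errors`) that covers required fields and the two types used here. It is not a general JSON Schema validator.

```python
# Illustrative JSON Schema for the write_note input described in FR-047.
WRITE_NOTE_INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "path": {"type": "string"},
        "title": {"type": "string"},
        "metadata": {"type": "object"},
        "body": {"type": "string"},
    },
    "required": ["path", "body"],  # title? and metadata? are optional
}

# Only the JSON types that appear in the schema above are mapped.
_JSON_TYPES = {"string": str, "object": dict}


def validation_errors(args: dict, schema: dict) -> list[str]:
    """Return human-readable problems; an empty list means the input is accepted."""
    errors = [
        f"missing required field: {name}"
        for name in schema["required"]
        if name not in args
    ]
    for name, value in args.items():
        spec = schema["properties"].get(name)
        # Fields not named in the schema are ignored in this sketch.
        if spec is not None and not isinstance(value, _JSON_TYPES[spec["type"]]):
            errors.append(f"{name} must be of type {spec['type']}")
    return errors
```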
187
+ #### Frontend (React + shadcn/ui)
188
+
189
+ - **FR-053**: System MUST provide a single-page React application built with Vite or Next.js, using shadcn/ui components
190
+ - **FR-054**: System MUST implement Obsidian-style layout: left sidebar (directory pane + search), right main pane (note viewer/editor)
191
+ - **FR-055**: System MUST display directory tree in left sidebar with collapsible folders and note leaf items, using shadcn `ScrollArea`
192
+ - **FR-056**: System MUST provide search input in left sidebar with debounced queries calling `GET /api/search`, displaying results in dropdown
193
+ - **FR-057**: System MUST render selected note in main pane as read-only Markdown by default using `react-markdown` with plugins for code blocks and links
194
+ - **FR-058**: System MUST display note metadata in footer: tags (chips), created/updated timestamps, backlinks (clickable)
195
+ - **FR-059**: System MUST provide "Edit" button to switch main pane to edit mode: left side textarea with markdown source, right side live preview
196
+ - **FR-060**: System MUST provide "Save" button in edit mode that calls `PUT /api/notes/{path}` with `if_version`, handling 409 Conflict by showing "Note changed, please reload" message
197
+ - **FR-061**: System MUST render wikilinks as clickable links, resolving to target notes on click
198
+ - **FR-062**: System MUST render unresolved wikilinks in a distinct "broken link" style with a "Create note" affordance
199
+ - **FR-063**: System MUST provide "New note" button in left sidebar to create new notes with auto-generated frontmatter template
200
+ - **FR-064**: System MUST implement authentication flow for HF Space mode: landing page with "Sign in with Hugging Face" button, redirecting to main app after OAuth callback
201
+ - **FR-065**: System MUST call `GET /api/me` on startup to detect authentication status and `POST /api/tokens` to obtain Bearer token for API calls
202
+ - **FR-066**: System MUST display user profile/settings view showing user_id and API token(s) with "copy" button for MCP configuration
203
+ - **FR-067**: System MUST store Bearer token in memory or localStorage and include in `Authorization` header for all API requests
204
+ - **FR-068**: System MUST display small index health indicator showing note count and last updated timestamp
205
+
206
+ ### Key Entities
207
+
208
+ - **User**: Represents an authenticated user (local-dev or HF OAuth identity). Has a unique `user_id`, optional HF profile data (username, avatar), and vault directory path.
209
+
210
+ - **Vault**: A user-specific directory tree containing Markdown notes. Has a root path (`/data/vaults/<user_id>/`), arbitrary nested folders, and .md files.
211
+
212
+ - **Note**: A Markdown file with optional YAML frontmatter and body content. Key attributes: `path` (relative to vault root, includes .md), `title` (from frontmatter or first H1 or filename stem), `metadata` (frontmatter key-value pairs), `body` (markdown content), `version` (integer counter), `created` (ISO timestamp), `updated` (ISO timestamp).
213
+
214
+ - **Wikilink**: A reference from one note to another using `[[link text]]` syntax. Has `source_note_path`, `link_text`, `target_note_path` (resolved via normalized slug matching, may be null if unresolved).
215
+
216
+ - **Tag**: A metadata label applied to notes via frontmatter `tags: [tag1, tag2]`. Has `tag_name` and count of associated notes.
217
+
218
+ - **Index**: Per-user data structures for efficient search and navigation. Contains: full-text inverted index (token → note paths), tag index (tag → note paths), link graph (note → outgoing wikilinks, note → backlinks).
219
+
220
+ - **Token (JWT)**: A signed JSON Web Token used for API and MCP authentication. Contains claims: `sub` (user_id), `exp` (expiration timestamp), signed with server secret.
221
+
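The Note entity's `title` has three possible sources. A minimal sketch of that fallback follows, assuming the order in which they are listed is the precedence (frontmatter, then first H1, then filename stem) and that frontmatter has already been parsed into a dict. `derive_title` is a hypothetical helper, and the H1 pattern does not try to skip `#` lines inside fenced code blocks.

```python
import re
from pathlib import PurePosixPath

# A level-1 heading: a single "#", at least one space or tab, then the text.
_H1 = re.compile(r"^#[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def derive_title(path: str, metadata: dict, body: str) -> str:
    """Pick a title from frontmatter, else the first H1, else the filename stem."""
    title = metadata.get("title")
    if isinstance(title, str) and title.strip():
        return title.strip()
    match = _H1.search(body)
    if match:
        return match.group(1)
    # "guides/setup-notes.md" -> "setup-notes"
    return PurePosixPath(path).stem
```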
222
+ ## Success Criteria *(mandatory)*
223
+
224
+ ### Measurable Outcomes
225
+
226
+ - **SC-001**: AI agents can create, read, update, and delete notes via MCP STDIO transport in under 500ms per operation for vaults with up to 1,000 notes
227
+ - **SC-002**: Human users can browse directory tree, select a note, and view rendered Markdown in under 2 seconds from click to full render
228
+ - **SC-003**: Search queries return ranked results in under 1 second for vaults with up to 5,000 notes
229
+ - **SC-004**: Wikilink navigation (click to target note) completes in under 1 second, with ambiguous links resolved deterministically without user intervention
230
+ - **SC-005**: Concurrent edit conflicts (human vs AI) are detected and prevented 100% of the time via optimistic concurrency for UI writes
231
+ - **SC-006**: Index health accurately reflects vault state: `note_count` matches actual file count within 1 second of any write operation (incremental update)
232
+ - **SC-007**: Multi-tenant isolation is complete: users never see or access other users' vaults or notes in API, MCP, or UI
233
+ - **SC-008**: OAuth authentication flow (HF Space mode) completes in under 10 seconds from "Sign in" click to main app view
234
+ - **SC-009**: Users can issue API tokens and configure MCP clients with documented steps, successfully making MCP HTTP requests within 5 minutes of first login
235
+ - **SC-010**: Manual index rebuild completes in under 30 seconds for vaults with up to 1,000 notes
236
+ - **SC-011**: System handles 10 concurrent users (each making read/write/search operations) without response time degradation beyond 2x baseline
237
+ - **SC-012**: Broken wikilinks are visually distinct and offer "Create note" affordance, with 90% of test users successfully creating target notes on first attempt
238
+ - **SC-013**: Note edit-save cycle with version conflict detection prevents silent overwrites in 100% of conflict scenarios
239
+ - **SC-014**: All API endpoints return appropriate HTTP status codes (200, 201, 400, 401, 403, 409, 413, 500) with clear error messages in JSON format
240
+
241
+ ## Assumptions
242
+
243
+ 1. **Deployment target**: Primary deployment is Hugging Face Space with OAuth. Local PoC uses static token or no auth.
244
+ 2. **Note format**: All notes are UTF-8 encoded Markdown with optional YAML frontmatter. No binary files or non-.md formats in vaults.
245
+ 3. **Wikilink syntax**: Only `[[wikilink]]` syntax is supported (Obsidian-style). No `[[link|alias]]` or other variants in initial version.
246
+ 4. **Search sophistication**: Simple tokenization (splitting on non-alphanumeric characters) is sufficient. No stemming, synonyms, or advanced NLP.
247
+ 5. **Concurrency model**: SQLite is acceptable for per-user indices given hackathon scale. Future versions may need distributed DB for higher concurrency.
248
+ 6. **Frontend deployment**: Frontend is served from the same Python process as backend (static files or integrated framework), not a separate service.
249
+ 7. **MCP client types**: Primary MCP clients are Claude Code, Claude Desktop (STDIO), and potentially other MCP-compatible tools via HTTP transport.
250
+ 8. **Storage**: Filesystem-based vault storage is sufficient. No object storage or cloud sync in initial version.
251
+ 9. **Performance targets**: Designed for individual user vaults (hundreds to low thousands of notes), not enterprise-scale knowledge bases.
252
+ 10. **Security**: HTTPS is assumed for HF Space deployment. JWT secret management follows standard env-var configuration practices.
253
+
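Assumption 3 restricts links to plain `[[wikilink]]` syntax, and the Wikilink entity says targets resolve through normalized slug matching. The sketch below shows how the link graph, backlinks, and unresolved links could be derived under those constraints. The exact normalization and the tie-break for ambiguous slugs are not defined in this section, so `slug` (lowercase, non-alphanumeric runs become `-`) and "first path in sorted order wins" are both assumptions, and `build_link_graph` is a hypothetical name.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Plain [[target]] only; the "|" exclusion leaves [[link|alias]] unmatched.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def slug(text: str) -> str:
    """Assumed normalization: lowercase, non-alphanumeric runs become '-'."""
    return re.sub(r"[^0-9a-z]+", "-", text.lower()).strip("-")


def build_link_graph(notes: dict[str, str]):
    """Return (outgoing, backlinks, unresolved) for a {path: body} vault."""
    by_slug: dict[str, str] = {}
    for path in sorted(notes):
        # Assumed tie-break: the first path in sorted order owns the slug.
        by_slug.setdefault(slug(PurePosixPath(path).stem), path)

    outgoing = defaultdict(list)
    backlinks = defaultdict(list)
    unresolved = defaultdict(list)
    for source, body in notes.items():
        for link_text in WIKILINK.findall(body):
            target = by_slug.get(slug(link_text))
            if target is None:
                unresolved[source].append(link_text)  # broken link (FR-017)
            else:
                outgoing[source].append(target)
                backlinks[target].append(source)
    return dict(outgoing), dict(backlinks), dict(unresolved)
```

With a vault containing `index.md` (body `See [[Setup Guide]] and [[Missing Page]].`) and `guides/setup-guide.md`, the first link resolves to `guides/setup-guide.md` and `Missing Page` is reported as unresolved for `index.md`.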
254
+ ## Scope Boundaries
255
+
256
+ ### In Scope
257
+
258
+ - Multi-tenant vault storage with per-user directories
259
+ - Full-text search, tag index, and bidirectional link graph
260
+ - Wikilink resolution with ambiguity handling and broken link detection
261
+ - HTTP API with Bearer auth (JWT)
262
+ - FastMCP server with STDIO (local) and HTTP (remote) transports
263
+ - React + shadcn/ui frontend with Obsidian-style layout
264
+ - Read-first UI with secondary editing capability
265
+ - Optimistic concurrency for UI, last-write-wins for MCP
266
+ - HF OAuth integration for multi-user Space deployment
267
+ - Manual index rebuild and health monitoring
268
+
269
+ ### Out of Scope
270
+
271
+ - AI-driven re-organization or "smart" refactors of documentation structure
272
+ - Complex agentic flows or autonomous planning by AI
273
+ - Wikilink aliases (`[[link|display text]]`)
274
+ - Advanced Markdown features (footnotes, math rendering, diagrams) beyond basic code/link support
275
+ - Real-time collaborative editing (operational transforms or CRDTs)
276
+ - Version history or rollback beyond current version conflict detection
277
+ - Export to non-Markdown formats (PDF, HTML, etc.)
278
+ - Import from other note systems (Notion, Evernote, etc.)
279
+ - Mobile-optimized UI (desktop-first only)
280
+ - Offline support or PWA capabilities
281
+ - Fine-grained RBAC or multi-user permissions within a single vault (each user has their own isolated vault)
282
+ - Auto-save or draft states in UI editor