bigwolfe committed
Commit 05c9551 · 1 Parent(s): 2e98ac7
CLAUDE.md CHANGED
@@ -271,3 +271,10 @@ Current active feature: `001-obsidian-docs-viewer`
 ```
 
 Obtain JWT: `POST /api/tokens` after HF OAuth login.
+
+ ## Active Technologies
+ - Python 3.11+ (backend), TypeScript (frontend) + FastAPI, LlamaIndex, llama-index-llms-google-genai, llama-index-embeddings-google-genai, React 18+, Tailwind CSS, Shadcn/UI (004-gemini-vault-chat)
+ - Filesystem vault (existing), LlamaIndex persisted vector store (new, under `data/llamaindex/`) (004-gemini-vault-chat)
+
+ ## Recent Changes
+ - 004-gemini-vault-chat: Added Python 3.11+ (backend), TypeScript (frontend) + FastAPI, LlamaIndex, llama-index-llms-google-genai, llama-index-embeddings-google-genai, React 18+, Tailwind CSS, Shadcn/UI
specs/004-gemini-vault-chat/checklists/requirements.md ADDED
@@ -0,0 +1,37 @@
+ # Specification Quality Checklist: Gemini Vault Chat Agent
+
+ **Purpose**: Validate specification completeness and quality before proceeding to planning
+ **Created**: 2025-11-28
+ **Feature**: [spec.md](../spec.md)
+
+ ## Content Quality
+
+ - [x] No implementation details (languages, frameworks, APIs)
+ - [x] Focused on user value and business needs
+ - [x] Written for non-technical stakeholders
+ - [x] All mandatory sections completed
+
+ ## Requirement Completeness
+
+ - [x] No [NEEDS CLARIFICATION] markers remain
+ - [x] Requirements are testable and unambiguous
+ - [x] Success criteria are measurable
+ - [x] Success criteria are technology-agnostic (no implementation details)
+ - [x] All acceptance scenarios are defined
+ - [x] Edge cases are identified
+ - [x] Scope is clearly bounded
+ - [x] Dependencies and assumptions identified
+
+ ## Feature Readiness
+
+ - [x] All functional requirements have clear acceptance criteria
+ - [x] User scenarios cover primary flows
+ - [x] Feature meets measurable outcomes defined in Success Criteria
+ - [x] No implementation details leak into specification
+
+ ## Notes
+
+ - All validation items passed on first review
+ - Spec is ready for the `/speckit.plan` phase
+ - The feature input was detailed from existing planning documents, enabling a complete spec without clarifications
+
specs/004-gemini-vault-chat/contracts/rag-api.yaml ADDED
@@ -0,0 +1,219 @@
+ openapi: 3.0.3
+ info:
+   title: Document-MCP RAG Chat API
+   description: Gemini-powered RAG chat endpoints for vault content querying
+   version: 1.0.0
+
+ paths:
+   /api/rag/chat:
+     post:
+       summary: Send a message and get AI-generated response
+       description: |
+         Accepts conversation history and returns an AI-synthesized answer
+         based on vault content, along with source references.
+       operationId: ragChat
+       tags:
+         - RAG
+       requestBody:
+         required: true
+         content:
+           application/json:
+             schema:
+               $ref: '#/components/schemas/ChatRequest'
+             example:
+               messages:
+                 - role: user
+                   content: "How does authentication work in this project?"
+       responses:
+         '200':
+           description: Successful response with answer and sources
+           content:
+             application/json:
+               schema:
+                 $ref: '#/components/schemas/ChatResponse'
+               example:
+                 answer: "Authentication in this project uses JWT tokens..."
+                 sources:
+                   - path: "API Documentation.md"
+                     title: "API Documentation"
+                     snippet: "The system uses JWT-based authentication..."
+                     score: 0.92
+                 notes_written: []
+         '400':
+           description: Invalid request (empty messages, invalid format)
+           content:
+             application/json:
+               schema:
+                 $ref: '#/components/schemas/ErrorResponse'
+         '503':
+           description: AI service unavailable
+           content:
+             application/json:
+               schema:
+                 $ref: '#/components/schemas/ErrorResponse'
+               example:
+                 error: "AI service temporarily unavailable"
+                 code: "SERVICE_UNAVAILABLE"
+
+   /api/rag/status:
+     get:
+       summary: Check RAG service status
+       description: Returns whether the index is ready and the service is operational
+       operationId: ragStatus
+       tags:
+         - RAG
+       responses:
+         '200':
+           description: Service status
+           content:
+             application/json:
+               schema:
+                 $ref: '#/components/schemas/StatusResponse'
+
+ components:
+   schemas:
+     ChatMessage:
+       type: object
+       required:
+         - role
+         - content
+       properties:
+         role:
+           type: string
+           enum: [user, assistant]
+           description: Message author
+         content:
+           type: string
+           maxLength: 10000
+           description: Message text
+         timestamp:
+           type: string
+           format: date-time
+           description: When the message was created
+         sources:
+           type: array
+           items:
+             $ref: '#/components/schemas/SourceReference'
+           description: Referenced notes (assistant messages only)
+         notes_written:
+           type: array
+           items:
+             $ref: '#/components/schemas/NoteWritten'
+           description: Notes created by agent (Phase 2)
+
+     SourceReference:
+       type: object
+       required:
+         - path
+         - title
+         - snippet
+       properties:
+         path:
+           type: string
+           description: Relative path in vault
+           example: "guides/authentication.md"
+         title:
+           type: string
+           description: Note title
+           example: "Authentication Guide"
+         snippet:
+           type: string
+           maxLength: 500
+           description: Relevant text excerpt
+         score:
+           type: number
+           format: float
+           minimum: 0
+           maximum: 1
+           description: Relevance score
+
+     NoteWritten:
+       type: object
+       required:
+         - path
+         - title
+         - action
+       properties:
+         path:
+           type: string
+           description: Path to created/updated note
+           pattern: "^agent-notes/.+\\.md$"
+         title:
+           type: string
+           description: Note title
+         action:
+           type: string
+           enum: [created, updated]
+           description: What the agent did
+
+     ChatRequest:
+       type: object
+       required:
+         - messages
+       properties:
+         messages:
+           type: array
+           minItems: 1
+           items:
+             $ref: '#/components/schemas/ChatMessage'
+           description: Conversation history; the last message must be from the user
+
+     ChatResponse:
+       type: object
+       required:
+         - answer
+         - sources
+         - notes_written
+       properties:
+         answer:
+           type: string
+           description: AI-generated response
+         sources:
+           type: array
+           items:
+             $ref: '#/components/schemas/SourceReference'
+           description: Notes used in response
+         notes_written:
+           type: array
+           items:
+             $ref: '#/components/schemas/NoteWritten'
+           description: Notes created (Phase 2, may be empty)
+
+     StatusResponse:
+       type: object
+       required:
+         - status
+         - index_ready
+       properties:
+         status:
+           type: string
+           enum: [ready, initializing, error]
+         index_ready:
+           type: boolean
+           description: Whether the vector index is loaded
+         documents_indexed:
+           type: integer
+           description: Number of documents in index
+         error_message:
+           type: string
+           description: Error details if status is error
+
+     ErrorResponse:
+       type: object
+       required:
+         - error
+         - code
+       properties:
+         error:
+           type: string
+           description: Human-readable error message
+         code:
+           type: string
+           description: Machine-readable error code
+           enum:
+             - INVALID_REQUEST
+             - EMPTY_MESSAGES
+             - SERVICE_UNAVAILABLE
+             - INDEX_NOT_READY
+             - RATE_LIMITED
+
specs/004-gemini-vault-chat/data-model.md ADDED
@@ -0,0 +1,122 @@
+ # Data Model: Gemini Vault Chat Agent
+
+ **Feature**: 004-gemini-vault-chat
+ **Date**: 2025-11-28
+
+ ## Entities
+
+ ### ChatMessage
+
+ Represents a single message in a conversation.
+
+ | Field | Type | Description | Constraints |
+ |-------|------|-------------|-------------|
+ | role | enum | Message author | `user` or `assistant` |
+ | content | string | Message text | Max 10,000 characters |
+ | timestamp | datetime | When message was created | ISO 8601 format |
+ | sources | SourceReference[] | Referenced notes (assistant only) | Optional, empty for user messages |
+ | notes_written | NoteWritten[] | Notes created by agent | Optional, Phase 2 only |
+
+ ### SourceReference
+
+ Metadata about a note used to generate a response.
+
+ | Field | Type | Description | Constraints |
+ |-------|------|-------------|-------------|
+ | path | string | Relative path in vault | Valid vault path, ends in `.md` |
+ | title | string | Note title | Derived from frontmatter/H1/filename |
+ | snippet | string | Relevant text excerpt | Max 500 characters |
+ | score | float | Relevance score | 0.0 to 1.0, optional |
+
+ ### NoteWritten (Phase 2)
+
+ Metadata about a note created or updated by the agent.
+
+ | Field | Type | Description | Constraints |
+ |-------|------|-------------|-------------|
+ | path | string | Path to created/updated note | Must be in `agent-notes/` folder |
+ | title | string | Note title | Required |
+ | action | enum | What the agent did | `created` or `updated` |
+
+ ### ChatRequest
+
+ Request payload for the RAG chat endpoint.
+
+ | Field | Type | Description | Constraints |
+ |-------|------|-------------|-------------|
+ | messages | ChatMessage[] | Conversation history | At least 1 message, last must be `user` |
+
+ ### ChatResponse
+
+ Response payload from the RAG chat endpoint.
+
+ | Field | Type | Description | Constraints |
+ |-------|------|-------------|-------------|
+ | answer | string | AI-generated response | Required |
+ | sources | SourceReference[] | Notes used in response | May be empty |
+ | notes_written | NoteWritten[] | Notes created (Phase 2) | May be empty |
+
+ ## State Transitions
+
+ ### Conversation Session
+
+ ```
+ [No Session] ---(user opens chat panel)---> [Active Session]
+ [Active Session] ---(user sends message)---> [Waiting for Response]
+ [Waiting for Response] ---(response received)---> [Active Session]
+ [Active Session] ---(page refresh/close)---> [No Session]
+ ```
+
+ ### Index Lifecycle
+
+ ```
+ [No Index] ---(startup, no persist dir)---> [Building Index]
+ [Building Index] ---(indexing complete)---> [Index Ready]
+ [No Index] ---(startup, persist dir exists)---> [Loading Index]
+ [Loading Index] ---(load successful)---> [Index Ready]
+ [Loading Index] ---(load failed)---> [Building Index]
+ [Index Ready] ---(query received)---> [Index Ready]
+ ```
+
+ ## Validation Rules
+
+ ### ChatMessage Validation
+
+ 1. `role` must be exactly `user` or `assistant`
+ 2. `content` must not be empty (whitespace-only is invalid)
+ 3. `content` must be ≤10,000 characters
+ 4. `sources` must be empty for `user` role messages
+
+ ### SourceReference Validation
+
+ 1. `path` must be a valid vault path (see `validate_note_path` in vault.py)
+ 2. `title` must not be empty
+ 3. `snippet` must be ≤500 characters
+ 4. `score`, if present, must be between 0.0 and 1.0
+
+ ### NoteWritten Validation (Phase 2)
+
+ 1. `path` must start with `agent-notes/`
+ 2. `path` must be a valid vault path
+ 3. `action` must be `created` or `updated`
+
+ ## Relationships
104
+
105
+ ```
106
+ ChatSession (frontend state)
107
+ └── contains 0..* ChatMessage
108
+ └── assistant messages contain 0..* SourceReference
109
+ └── references 1 VaultNote (existing)
110
+ └── assistant messages may contain 0..* NoteWritten
111
+ └── creates/updates 1 VaultNote
112
+ ```
113
+
114
+ ## Persistence
115
+
116
+ | Entity | Storage | Lifetime |
117
+ |--------|---------|----------|
118
+ | ChatMessage | Frontend memory | Session (cleared on refresh) |
119
+ | SourceReference | Derived from query | Per response |
120
+ | NoteWritten | VaultService (filesystem) | Permanent |
121
+ | Vector Index | LlamaIndex persist dir | Until rebuild |
122
+
specs/004-gemini-vault-chat/plan.md ADDED
@@ -0,0 +1,130 @@
+ # Implementation Plan: Gemini Vault Chat Agent
+
+ **Branch**: `004-gemini-vault-chat` | **Date**: 2025-11-28 | **Spec**: [spec.md](./spec.md)
+ **Input**: Feature specification from `/specs/004-gemini-vault-chat/spec.md`
+
+ ## Summary
+
+ Add a Gemini-powered RAG chat agent to the Document-MCP platform. Users can ask natural language questions about their Markdown vault and receive AI-synthesized answers grounded in their documents. The system uses LlamaIndex for document indexing and retrieval, with Gemini as both the LLM and embedding model. An optional Phase 2 adds constrained note-writing capabilities.
+
+ ## Technical Context
+
+ **Language/Version**: Python 3.11+ (backend), TypeScript (frontend)
+ **Primary Dependencies**: FastAPI, LlamaIndex, llama-index-llms-google-genai, llama-index-embeddings-google-genai, React 18+, Tailwind CSS, Shadcn/UI
+ **Storage**: Filesystem vault (existing), LlamaIndex persisted vector store (new, under `data/llamaindex/`)
+ **Testing**: pytest (backend), manual verification (frontend)
+ **Target Platform**: Hugging Face Spaces (Docker), Linux server
+ **Project Type**: Web application (frontend + backend)
+ **Performance Goals**: <5 seconds for RAG response (per SC-001)
+ **Constraints**: Must not break existing MCP server or ChatGPT widget
+ **Scale/Scope**: Hackathon scale; index rebuilds acceptable on restart
+
+ ## Constitution Check
+
+ *GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
+
+ | Principle | Status | Notes |
+ |-----------|--------|-------|
+ | I. Brownfield Integration | ✅ Pass | Uses existing VaultService, adds new routes/services alongside existing code |
+ | II. Test-Backed Development | ✅ Pass | Plan includes pytest tests for RAG service; frontend is manual verification |
+ | III. Incremental Delivery | ✅ Pass | P1 stories (read-only RAG) can ship before P3 (write tools) |
+ | IV. Specification-Driven | ✅ Pass | All work traced to spec.md; Phase 2 is optional per spec |
+ | No Magic | ✅ Pass | Direct LlamaIndex usage, no custom abstractions |
+ | Single Source of Truth | ✅ Pass | Vault remains source of truth; index is derived view |
+ | Error Handling | ✅ Pass | Spec requires FR-011 error messages for AI unavailability |
+
+ **Technology Stack Compliance**:
+ - Backend: Python 3.11+, FastAPI, Pydantic ✅
+ - Frontend: React 18+, TypeScript, Tailwind, Shadcn/UI ✅
+ - Storage: Filesystem-based (LlamaIndex persisted store) ✅
+
+ ## Project Structure
+
+ ### Documentation (this feature)
+
+ ```text
+ specs/004-gemini-vault-chat/
+ ├── plan.md              # This file
+ ├── research.md          # Phase 0 output
+ ├── data-model.md        # Phase 1 output
+ ├── quickstart.md        # Phase 1 output
+ ├── contracts/           # Phase 1 output
+ │   └── rag-api.yaml     # OpenAPI spec for RAG endpoints
+ └── tasks.md             # Phase 2 output (created by /speckit.tasks)
+ ```
+
+ ### Source Code (repository root)
+
+ ```text
+ backend/
+ ├── src/
+ │   ├── api/
+ │   │   └── routes/
+ │   │       └── rag.py              # NEW: RAG chat endpoint
+ │   ├── models/
+ │   │   └── rag.py                  # NEW: Pydantic models for RAG
+ │   └── services/
+ │       └── rag_index.py            # NEW: LlamaIndex service
+ └── tests/
+     └── unit/
+         └── test_rag_service.py     # NEW: RAG service tests
+
+ frontend/
+ ├── src/
+ │   ├── components/
+ │   │   ├── ChatPanel.tsx           # NEW: Chat interface
+ │   │   ├── ChatMessage.tsx         # NEW: Message component
+ │   │   └── SourceList.tsx          # NEW: Source references
+ │   ├── services/
+ │   │   └── rag.ts                  # NEW: RAG API client
+ │   └── types/
+ │       └── rag.ts                  # NEW: TypeScript types
+
+ data/
+ └── llamaindex/                     # NEW: Persisted vector index
+ ```
+
+ **Structure Decision**: Web application structure (Option 2). New files added alongside existing code per Constitution Principle I.
+
+ ## Complexity Tracking
+
+ > No violations requiring justification.
+
+ ## Implementation Phases
+
+ ### Phase 1: Core RAG Query (P1 Stories)
+
+ Implements User Stories 1-2: ask questions, view sources.
+
+ **Backend Tasks**:
+ 1. Add LlamaIndex dependencies to `requirements.txt`
+ 2. Create `rag_index.py` service with `get_or_build_index()` singleton
+ 3. Create `rag.py` Pydantic models for request/response
+ 4. Create `rag.py` route with `POST /api/rag/chat` endpoint
+ 5. Add unit tests for RAG service
+
+ **Frontend Tasks**:
+ 1. Create `ChatPanel.tsx` component with message list and composer
+ 2. Create `ChatMessage.tsx` for rendering user/assistant messages
+ 3. Create `SourceList.tsx` for collapsible source references
+ 4. Add `rag.ts` API client service
+ 5. Integrate ChatPanel into MainApp layout
+
+ ### Phase 2: Multi-Turn Conversation (P2 Story)
+
+ Implements User Story 3: context-aware follow-ups.
+
+ **Tasks**:
+ 1. Maintain chat history in frontend state
+ 2. Pass full message history to backend
+ 3. Update RAG service to use chat history for context
+
+ ### Phase 3: Agent Note Writing (P3 Story, Optional)
+
+ Implements User Story 4: create/append notes via agent.
+
+ **Tasks**:
+ 1. Create constrained write helpers (`create_note`, `append_to_note`)
+ 2. Register as LlamaIndex agent tools
+ 3. Add `notes_written` to response model
+ 4. Show created notes badge in UI
specs/004-gemini-vault-chat/quickstart.md ADDED
@@ -0,0 +1,144 @@
+ # Quickstart: Gemini Vault Chat Agent
+
+ **Feature**: 004-gemini-vault-chat
+ **Date**: 2025-11-28
+
+ ## Prerequisites
+
+ 1. Python 3.11+ installed
+ 2. Node.js 18+ installed
+ 3. Google API key with Gemini access
+
+ ## Setup
+
+ ### 1. Install Backend Dependencies
+
+ ```bash
+ cd backend
+ pip install llama-index llama-index-llms-google-genai llama-index-embeddings-google-genai
+ ```
+
+ ### 2. Configure Environment
+
+ Add to your `.env` file (or export in terminal):
+
+ ```bash
+ GOOGLE_API_KEY=your-gemini-api-key-here
+ VAULT_DIR=data/vaults/demo-user
+ LLAMAINDEX_PERSIST_DIR=data/llamaindex
+ ```
+
+ ### 3. Start Backend
+
+ ```bash
+ cd backend
+ uvicorn src.api.main:app --reload --port 8000
+ ```
+
+ The RAG index will be built on first startup (this may take a few seconds).
+
+ ### 4. Start Frontend
+
+ ```bash
+ cd frontend
+ npm install
+ npm run dev
+ ```
+
+ ## Verify Installation
+
+ ### Check RAG Status
+
+ ```bash
+ curl http://localhost:8000/api/rag/status
+ ```
+
+ Expected response:
+
+ ```json
+ {
+   "status": "ready",
+   "index_ready": true,
+   "documents_indexed": 15
+ }
+ ```
+
+ ### Test RAG Chat
+
+ ```bash
+ curl -X POST http://localhost:8000/api/rag/chat \
+   -H "Content-Type: application/json" \
+   -d '{"messages": [{"role": "user", "content": "What is this project about?"}]}'
+ ```
+
+ Expected response:
+
+ ```json
+ {
+   "answer": "This project is Document-MCP, a...",
+   "sources": [
+     {
+       "path": "Getting Started.md",
+       "title": "Getting Started",
+       "snippet": "Document-MCP provides...",
+       "score": 0.89
+     }
+   ],
+   "notes_written": []
+ }
+ ```
+
+ ## Development Workflow
+
+ ### Backend Changes
+
+ 1. Edit files in `backend/src/services/rag_index.py` or `backend/src/api/routes/rag.py`
+ 2. Server auto-reloads with the `--reload` flag
+ 3. Run tests: `cd backend && pytest tests/unit/test_rag_service.py -v`
+
+ ### Frontend Changes
+
+ 1. Edit files in `frontend/src/components/` (ChatPanel, ChatMessage, SourceList)
+ 2. Vite auto-reloads on save
+ 3. Open browser at `http://localhost:5173`
+
+ ### Rebuilding the Index
+
+ Delete the persist directory and restart:
+
+ ```bash
+ rm -rf data/llamaindex
+ # Restart backend
+ ```
+
+ ## File Locations
+
+ | Component | Path |
+ |-----------|------|
+ | RAG Service | `backend/src/services/rag_index.py` |
+ | RAG Routes | `backend/src/api/routes/rag.py` |
+ | RAG Models | `backend/src/models/rag.py` |
+ | Chat Panel | `frontend/src/components/ChatPanel.tsx` |
+ | API Client | `frontend/src/services/rag.ts` |
+ | Types | `frontend/src/types/rag.ts` |
+ | Index Storage | `data/llamaindex/` |
+
+ ## Troubleshooting
+
+ ### "GOOGLE_API_KEY not set"
+
+ Ensure the environment variable is exported:
+
+ ```bash
+ export GOOGLE_API_KEY=your-key-here
+ ```
+
+ ### "Index not ready"
+
+ Wait a few seconds after startup for indexing to complete. Check logs for errors.
+
+ ### "Rate limited"
+
+ The Gemini API has rate limits. Wait and retry, or check your API quota.
+
+ ### Empty sources in response
+
+ Check that your vault has Markdown files. Run `ls data/vaults/demo-user/` to verify.
+
specs/004-gemini-vault-chat/research.md ADDED
@@ -0,0 +1,152 @@
+ # Research: Gemini Vault Chat Agent
+
+ **Feature**: 004-gemini-vault-chat
+ **Date**: 2025-11-28
+
+ ## LlamaIndex Integration
+
+ ### Decision: Use LlamaIndex Core with Google GenAI Extensions
+
+ **Rationale**: LlamaIndex provides a mature, well-documented framework for building RAG applications. The `llama-index-llms-google-genai` and `llama-index-embeddings-google-genai` packages provide first-class Gemini support without requiring custom integration code.
+
+ **Alternatives Considered**:
+ - **LangChain**: More complex, larger dependency footprint. LlamaIndex is more focused on document retrieval use cases.
+ - **Direct Gemini API**: Would require implementing chunking, embedding, and retrieval logic manually. Higher development effort.
+ - **OpenAI + pgvector**: Requires PostgreSQL, conflicting with the SQLite-only approach in the constitution.
+
+ ### Key LlamaIndex Patterns
+
+ ```python
+ # Singleton index pattern (recommended)
+ from pathlib import Path
+
+ from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext
+ from llama_index.core import load_index_from_storage
+ from llama_index.llms.google_genai import GoogleGenAI
+ from llama_index.embeddings.google_genai import GoogleGenAIEmbedding
+
+ _index: VectorStoreIndex | None = None
+
+ def get_or_build_index(vault_path: Path, persist_dir: Path) -> VectorStoreIndex:
+     global _index
+     if _index is not None:
+         return _index
+
+     if persist_dir.exists():
+         storage_context = StorageContext.from_defaults(persist_dir=str(persist_dir))
+         _index = load_index_from_storage(storage_context)
+     else:
+         documents = SimpleDirectoryReader(str(vault_path), recursive=True).load_data()
+         _index = VectorStoreIndex.from_documents(documents)
+         _index.storage_context.persist(persist_dir=str(persist_dir))
+
+     return _index
+ ```
+
+ ## Gemini Model Selection
+
+ ### Decision: gemini-1.5-flash for LLM, text-embedding-004 for Embeddings
+
+ **Rationale**:
+ - `gemini-1.5-flash` offers a good balance of speed and quality for interactive chat
+ - `text-embedding-004` is Google's latest text embedding model with 768 dimensions
+ - Both are cost-effective for hackathon/demo scale
+
+ **Alternatives Considered**:
+ - `gemini-1.5-pro`: Higher quality but slower and more expensive
+ - `gemini-2.0-flash-exp`: Experimental, may not be stable
+
+ ### Environment Variables
+
+ ```
+ GOOGLE_API_KEY=<api-key>
+ VAULT_DIR=data/vaults/demo-user          # Or dynamically per user
+ LLAMAINDEX_PERSIST_DIR=data/llamaindex
+ ```
+
+ ## Source Attribution Strategy
+
+ ### Decision: Extract source metadata from LlamaIndex response nodes
+
+ **Rationale**: LlamaIndex query responses include source nodes with file paths and text chunks. We can map these back to vault note paths and extract snippets for display.
+
+ ```python
+ response = query_engine.query(question)
+ sources = []
+ for node in response.source_nodes:
+     sources.append({
+         "path": node.metadata.get("file_path"),
+         "title": derive_title_from_path(node.metadata.get("file_path")),
+         "snippet": node.text[:200] + "..." if len(node.text) > 200 else node.text,
+         "score": node.score,
+     })
+ ```
+
+ ## Multi-Turn Conversation
+
+ ### Decision: Use LlamaIndex ChatEngine for conversation memory
+
+ **Rationale**: LlamaIndex provides `as_chat_engine()`, which wraps the index with conversation memory. This handles context naturally without a custom implementation.
+
+ ```python
+ chat_engine = index.as_chat_engine(
+     chat_mode="context",
+     llm=GoogleGenAI(model="gemini-1.5-flash"),
+ )
+ response = chat_engine.chat("Follow-up question here")
+ ```
+
+ **Note**: For the MVP, we'll use a simpler approach where the frontend passes the full message history and we construct context in the query. This avoids server-side session state.
+
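The stateless MVP approach described in the note, where each request carries the full history, can be sketched as a small prompt-assembly helper. This is illustrative only; the exact prompt format and helper name are assumptions, not part of the plan.

```python
def build_query_with_history(messages: list[dict]) -> str:
    """Fold prior turns into a single query string; the last message must be from the user."""
    assert messages and messages[-1]["role"] == "user"
    history = messages[:-1]
    if not history:
        # First turn: the question stands alone.
        return messages[-1]["content"]
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    return (
        "Previous conversation:\n"
        f"{transcript}\n\n"
        f"Current question: {messages[-1]['content']}"
    )
```

The resulting string would be passed to the query engine in place of the raw question, so "it"/"that" references resolve against the transcript.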
+ ## Agent Tools (Phase 2)
+
+ ### Decision: Use LlamaIndex FunctionTool with constrained paths
+
+ **Rationale**: LlamaIndex supports registering Python functions as tools for agentic use. We can constrain write operations to an `agent-notes/` subdirectory.
+
+ ```python
+ from llama_index.core.tools import FunctionTool
+
+ def create_note(title: str, content: str) -> str:
+     """Create a new note in the agent folder."""
+     safe_filename = slugify(title)
+     path = f"agent-notes/{safe_filename}.md"
+     vault_service.write_note(user_id, path, title=title, body=content)
+     return f"Created note: {path}"
+
+ create_note_tool = FunctionTool.from_defaults(fn=create_note)
+ ```
+
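The `slugify` call above is a helper the project would need to provide. A minimal sketch, assuming lowercase hyphen-separated slugs are acceptable (the exact slug rules are an assumption):

```python
import re


def slugify(title: str) -> str:
    """Reduce a note title to a filesystem-safe slug (lowercase, hyphen-separated)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"
```

Because the slug feeds directly into a vault path, keeping it to `[a-z0-9-]` also helps the `^agent-notes/.+\.md$` pattern in the API contract stay predictable.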
+ ## Error Handling
+
+ ### Decision: Graceful degradation with user-friendly messages
+
+ **Patterns**:
+ 1. API key missing → 503 "AI service not configured"
+ 2. API rate limit → 429 "Please wait and try again"
+ 3. Network error → 503 "AI service temporarily unavailable"
+ 4. Empty vault → 200 with message "No documents indexed"
+
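The failure patterns above map naturally onto the `ErrorResponse` schema in the API contract via a single lookup used by the route's exception handling. A sketch, where the condition names are assumptions for illustration:

```python
# Map internal failure conditions to (HTTP status, code, message) per the contract.
ERROR_RESPONSES = {
    "missing_api_key": (503, "SERVICE_UNAVAILABLE", "AI service not configured"),
    "rate_limited": (429, "RATE_LIMITED", "Please wait and try again"),
    "network_error": (503, "SERVICE_UNAVAILABLE", "AI service temporarily unavailable"),
}


def error_payload(condition: str) -> tuple[int, dict]:
    """Return (HTTP status, ErrorResponse body) for a known failure condition."""
    status, code, message = ERROR_RESPONSES[condition]
    return status, {"error": message, "code": code}
```

The empty-vault case is intentionally absent: per pattern 4 it is a normal 200 response, not an error.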
+ ## Performance Considerations
+
+ ### Index Persistence
+
+ - First indexing: ~1-5 seconds for small vaults (<100 notes)
+ - Subsequent loads: ~100 ms from persisted storage
+ - Query latency: ~1-3 seconds, depending on Gemini API response time
+
+ ### Recommendations
+
+ 1. Load the index on startup (not on first request)
+ 2. Use an environment variable to configure the persist directory
+ 3. For large vaults, consider lazy loading or background indexing (post-MVP)
+
+ ## Dependencies to Add
+
+ ```
+ # requirements.txt additions
+ llama-index
+ llama-index-llms-google-genai
+ llama-index-embeddings-google-genai
+ ```
+
+ **Note**: These packages have their own dependencies (e.g., `google-generativeai`). Tested compatible with Python 3.11+.
+
specs/004-gemini-vault-chat/spec.md ADDED
@@ -0,0 +1,122 @@
+ # Feature Specification: Gemini Vault Chat Agent
+
+ **Feature Branch**: `004-gemini-vault-chat`
+ **Created**: 2025-11-28
+ **Status**: Draft
+ **Input**: User description: "Add a Gemini-powered planning chat agent using LlamaIndex for RAG over the Markdown vault. Use Gemini as both LLM and embedding model. Include a new chat panel in the HF Space frontend that calls a RAG backend endpoint, displays assistant responses with linked sources, and optionally allows the agent to write notes."
+
+ ## User Scenarios & Testing *(mandatory)*
+
+ ### User Story 1 - Ask Questions About Vault Content (Priority: P1)
+
+ A user opens the Gemini Planning Agent panel and asks a question about content stored in their Markdown vault. The system searches the vault, retrieves relevant passages, and returns an AI-generated answer that synthesizes information from the relevant notes.
+
+ **Why this priority**: This is the core value proposition: enabling users to query their knowledge base conversationally and get AI-synthesized answers grounded in their own documents.
+
+ **Independent Test**: Can be fully tested by typing a question and verifying the response is relevant to vault content, with sources listed.
+
+ **Acceptance Scenarios**:
+
+ 1. **Given** a vault containing notes about project architecture, **When** user asks "How does authentication work?", **Then** the system returns an answer citing relevant notes with snippets
+ 2. **Given** a vault with multiple related notes, **When** user asks a question that spans multiple topics, **Then** the system synthesizes information from multiple sources and lists all referenced notes
+ 3. **Given** a vault with no relevant content, **When** user asks an unrelated question, **Then** the system responds that no relevant information was found in the vault
+
+ ---
+
+ ### User Story 2 - View Source Notes (Priority: P1)
+
+ After receiving an answer from the chat agent, the user can see which notes were used to generate the response. They can click on a source to view the note in the existing document viewer or see an inline snippet.
+
+ **Why this priority**: Source attribution is essential for trust and verification. Users need to know where information comes from and validate AI responses against original content.
+
+ **Independent Test**: Can be tested by receiving an answer and clicking on a listed source to verify it opens the correct note.
+
+ **Acceptance Scenarios**:
+
+ 1. **Given** an assistant response with sources, **When** user clicks a source link, **Then** the corresponding note opens in the document viewer
+ 2. **Given** an assistant response with sources, **When** user expands a source, **Then** they see a snippet of the relevant passage
+ 3. **Given** an assistant response, **When** sources are displayed, **Then** each source shows the note title and path
+
+ ---
+
+ ### User Story 3 - Multi-Turn Conversation (Priority: P2)
+
+ Users can have a multi-turn conversation with the agent, asking follow-up questions that build on previous context. The agent maintains conversation history for coherent responses.
+
+ **Why this priority**: Natural conversation flow improves user experience, but basic single-query functionality delivers core value first.
+
+ **Independent Test**: Can be tested by asking a question, then asking a follow-up that references "it" or "that", and verifying the agent understands the context.
+
+ **Acceptance Scenarios**:
+
+ 1. **Given** a previous question about "authentication", **When** user asks "How do I configure it?", **Then** the agent understands "it" refers to authentication
+ 2. **Given** an ongoing conversation, **When** user starts a new topic, **Then** the agent responds appropriately to the new context
+ 3. **Given** a conversation session, **When** user refreshes the page, **Then** conversation history is cleared (a new session starts)
+
+ ---
+
+ ### User Story 4 - Agent Creates Notes (Priority: P3)
+
+ Users can instruct the agent to create new notes based on the conversation. The agent writes notes to a dedicated folder in the vault and informs the user what was created.
+
+ **Why this priority**: Note creation adds significant value but requires more complex safety controls. Core reading/query functionality should be solid first.
63
+
64
+ **Independent Test**: Can be tested by asking the agent to "create a summary note about X" and verifying a new note appears in the designated folder.
65
+
66
+ **Acceptance Scenarios**:
67
+
68
+ 1. **Given** a conversation about a topic, **When** user asks "create a summary note", **Then** the agent creates a new Markdown note in the agent folder
69
+ 2. **Given** an agent-created note, **When** user views the response, **Then** a badge or link shows the created note path
70
+ 3. **Given** an existing note, **When** user asks the agent to append content, **Then** the agent updates the existing note appropriately
71
+
72
+ ---
73
+
74
+ ### Edge Cases
75
+
76
+ - What happens when the vault is empty or has no indexed content? β†’ System returns a friendly message indicating no documents are available
77
+ - How does the system handle very long user queries? β†’ Query is truncated to reasonable limits with user notification
78
+ - What happens if the AI service is unavailable? β†’ System shows an error message and suggests retrying
79
+ - How are malformed or non-Markdown files handled? β†’ Non-Markdown files are ignored during indexing
80
+ - What if the agent tries to write outside the designated folder? β†’ Write operations are constrained to the agent folder only
81
+
82
+ ## Requirements *(mandatory)*
83
+
84
+ ### Functional Requirements
85
+
86
+ - **FR-001**: System MUST provide a chat interface for users to ask natural language questions about vault content
87
+ - **FR-002**: System MUST search the vault and retrieve relevant passages to answer user queries
88
+ - **FR-003**: System MUST generate AI responses that synthesize information from retrieved content
89
+ - **FR-004**: System MUST display source notes for each response, including note title and path
90
+ - **FR-005**: System MUST allow users to navigate from a source reference to the full note
91
+ - **FR-006**: System MUST maintain conversation history within a session for multi-turn dialogue
92
+ - **FR-007**: System MUST build and persist a searchable index of vault content
93
+ - **FR-008**: System MUST load an existing index on startup if available
94
+ - **FR-009**: System MUST constrain agent write operations to a designated agent folder only
95
+ - **FR-010**: System MUST display a notification when the agent creates or updates a note
96
+ - **FR-011**: System MUST show an appropriate error message if the AI service is unavailable
97
+
98
+ ### Key Entities
99
+
100
+ - **Chat Message**: Represents a single message in the conversation (role: user or assistant, content, timestamp)
101
+ - **Chat Session**: A collection of messages in a single conversation context (started when user opens panel, cleared on page refresh)
102
+ - **Source Reference**: Metadata about a note used to generate a response (note title, path, relevant snippet)
103
+ - **Agent Note**: A Markdown note created by the agent, stored in the designated agent folder
104
+
105
+ ## Success Criteria *(mandatory)*
106
+
107
+ ### Measurable Outcomes
108
+
109
+ - **SC-001**: Users receive a relevant answer with sources within 5 seconds of submitting a query
110
+ - **SC-002**: 90% of responses include at least one source reference when relevant content exists
111
+ - **SC-003**: Users can navigate from a source reference to the full note in one click
112
+ - **SC-004**: Multi-turn conversations correctly reference previous context in 80% of follow-up questions
113
+ - **SC-005**: Agent-created notes appear in the designated folder and are visible in the vault viewer within 2 seconds
114
+ - **SC-006**: System gracefully handles AI service unavailability with a clear error message
115
+
116
+ ## Assumptions
117
+
118
+ - Users have a Markdown vault with content they want to query
119
+ - The existing document viewer from the Docs Widget can be reused for viewing source notes
120
+ - Index rebuilds are acceptable on service restarts for the initial release
121
+ - Session history is ephemeral and not persisted across page refreshes
122
+ - Agent write operations are limited to creating and appending to notes (no deletion)
specs/004-gemini-vault-chat/tasks.md ADDED
@@ -0,0 +1,228 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Tasks: Gemini Vault Chat Agent
2
+
3
+ **Input**: Design documents from `/specs/004-gemini-vault-chat/`
4
+ **Prerequisites**: plan.md βœ…, spec.md βœ…, research.md βœ…, data-model.md βœ…, contracts/ βœ…
5
+
6
+ **Tests**: Unit tests for RAG service included per Constitution (Test-Backed Development).
7
+
8
+ **Organization**: Tasks grouped by user story for independent implementation and testing.
9
+
10
+ ## Format: `[ID] [P?] [Story] Description`
11
+
12
+ - **[P]**: Can run in parallel (different files, no dependencies)
13
+ - **[Story]**: Which user story this task belongs to (US1, US2, US3, US4)
14
+ - Include exact file paths in descriptions
15
+
16
+ ## Path Conventions
17
+
18
+ - **Backend**: `backend/src/`, `backend/tests/`
19
+ - **Frontend**: `frontend/src/`
20
+ - **Data**: `data/llamaindex/`
21
+
22
+ ---
23
+
24
+ ## Phase 1: Setup (Shared Infrastructure)
25
+
26
+ **Purpose**: Add dependencies and create type definitions
27
+
28
+ - [ ] T001 Add LlamaIndex dependencies to `backend/requirements.txt`: llama-index, llama-index-llms-google-genai, llama-index-embeddings-google-genai
29
+ - [ ] T002 [P] Create TypeScript types in `frontend/src/types/rag.ts`: ChatMessage, SourceReference, NoteWritten, ChatRequest, ChatResponse
30
+ - [ ] T003 [P] Add GOOGLE_API_KEY and LLAMAINDEX_PERSIST_DIR to environment configuration in `backend/src/services/config.py`
31
+
32
+ ---
33
+
34
+ ## Phase 2: Foundational (Blocking Prerequisites)
35
+
36
+ **Purpose**: Core backend infrastructure for RAG that all user stories depend on
37
+
38
+ **⚠️ CRITICAL**: No user story work can begin until this phase is complete
39
+
40
+ - [ ] T004 Create Pydantic models in `backend/src/models/rag.py`: ChatMessage, SourceReference, NoteWritten, ChatRequest, ChatResponse, StatusResponse, ErrorResponse
41
+ - [ ] T005 Create RAG index service skeleton in `backend/src/services/rag_index.py` with `get_or_build_index()` singleton pattern
42
+ - [ ] T006 Implement index persistence: load from `data/llamaindex/` if exists, otherwise build and persist in `backend/src/services/rag_index.py`
43
+ - [ ] T007 Create `backend/tests/unit/test_rag_service.py` with test stubs for index loading, query execution, and error handling
44
+ - [ ] T008 Register RAG routes in `backend/src/api/main.py` (import and include rag router)
45
+
46
+ **Checkpoint**: Foundation ready - RAG service can load/build index on startup
47
+
48
+ ---
49
+
50
+ ## Phase 3: User Story 1 & 2 - Ask Questions + View Sources (Priority: P1) 🎯 MVP
51
+
52
+ **Goal**: Users can ask questions and receive AI-synthesized answers with source attribution
53
+
54
+ **Independent Test**: Type a question in the chat panel, verify response includes answer text and clickable source references
55
+
56
+ ### Backend Implementation (US1+US2)
57
+
58
+ - [ ] T009 [US1] Implement `rag_chat()` function in `backend/src/services/rag_index.py` that queries index and returns answer with sources
59
+ - [ ] T010 [US1] Extract source metadata from LlamaIndex response nodes (path, title, snippet, score) in `backend/src/services/rag_index.py`
60
+ - [ ] T011 [US1] Create POST `/api/rag/chat` endpoint in `backend/src/api/routes/rag.py` wrapping `rag_chat()`
61
+ - [ ] T012 [P] [US1] Create GET `/api/rag/status` endpoint in `backend/src/api/routes/rag.py` returning index status
62
+ - [ ] T013 [US1] Implement unit tests for `rag_chat()` in `backend/tests/unit/test_rag_service.py`: happy path, no results, error handling
63
+
64
+ ### Frontend Implementation (US1+US2)
65
+
66
+ - [ ] T014 [P] [US2] Create RAG API client in `frontend/src/services/rag.ts` with `sendMessage()` and `getStatus()` functions
67
+ - [ ] T015 [P] [US2] Create ChatMessage component in `frontend/src/components/ChatMessage.tsx` rendering user/assistant messages
68
+ - [ ] T016 [P] [US2] Create SourceList component in `frontend/src/components/SourceList.tsx` with collapsible source references
69
+ - [ ] T017 [US1] Create ChatPanel component in `frontend/src/components/ChatPanel.tsx` with message list and composer textarea
70
+ - [ ] T018 [US1] Integrate ChatPanel into MainApp layout in `frontend/src/pages/MainApp.tsx` as new panel/tab
71
+ - [ ] T019 [US2] Wire SourceList click handler to open note in document viewer via existing navigation
72
+
73
+ **Checkpoint**: User can ask a question, see AI answer with sources, and click source to view note
74
+
75
+ ---
76
+
77
+ ## Phase 4: User Story 3 - Multi-Turn Conversation (Priority: P2)
78
+
79
+ **Goal**: Users can have context-aware follow-up conversations
80
+
81
+ **Independent Test**: Ask "What is authentication?", then ask "How do I configure it?" - verify agent understands "it" refers to authentication
82
+
83
+ ### Implementation (US3)
84
+
85
+ - [ ] T020 [US3] Add message history state management in `frontend/src/components/ChatPanel.tsx` using React useState
86
+ - [ ] T021 [US3] Pass full message history array to `POST /api/rag/chat` in `frontend/src/services/rag.ts`
87
+ - [ ] T022 [US3] Update `rag_chat()` in `backend/src/services/rag_index.py` to construct context from message history
88
+ - [ ] T023 [US3] Add conversation reset button in `frontend/src/components/ChatPanel.tsx` to clear history
89
+ - [ ] T024 [US3] Add unit test for multi-turn context handling in `backend/tests/unit/test_rag_service.py`
90
+
91
+ **Checkpoint**: Multi-turn conversation maintains context; page refresh clears history
92
+
93
+ ---
94
+
95
+ ## Phase 5: User Story 4 - Agent Creates Notes (Priority: P3, Optional)
96
+
97
+ **Goal**: Agent can create/append notes in a designated folder
98
+
99
+ **Independent Test**: Ask "create a summary note about authentication" - verify note appears in `agent-notes/` folder
100
+
101
+ ### Implementation (US4)
102
+
103
+ - [ ] T025 [US4] Create `create_note()` helper in `backend/src/services/rag_index.py` constrained to `agent-notes/` folder
104
+ - [ ] T026 [US4] Create `append_to_note()` helper in `backend/src/services/rag_index.py` for updating existing notes
105
+ - [ ] T027 [US4] Register helpers as LlamaIndex FunctionTools in `backend/src/services/rag_index.py`
106
+ - [ ] T028 [US4] Update `rag_chat()` to use agent mode with tools when write intent detected
107
+ - [ ] T029 [US4] Add `notes_written` to ChatResponse and include in API response from `backend/src/api/routes/rag.py`
108
+ - [ ] T030 [P] [US4] Add NoteWritten badge component in `frontend/src/components/ChatMessage.tsx` showing created note path
109
+ - [ ] T031 [US4] Wire badge click to navigate to created note in vault viewer
110
+ - [ ] T032 [US4] Add unit tests for constrained write operations in `backend/tests/unit/test_rag_service.py`
111
+
112
+ **Checkpoint**: Agent can create notes; writes constrained to `agent-notes/` folder
113
+
114
+ ---
115
+
116
+ ## Phase 6: Polish & Cross-Cutting Concerns
117
+
118
+ **Purpose**: Error handling, edge cases, and validation
119
+
120
+ - [ ] T033 [P] Implement error handling for missing GOOGLE_API_KEY with 503 response in `backend/src/services/rag_index.py`
121
+ - [ ] T034 [P] Implement error handling for API rate limits with 429 response in `backend/src/api/routes/rag.py`
122
+ - [ ] T035 [P] Add loading state and error display in `frontend/src/components/ChatPanel.tsx`
123
+ - [ ] T036 [P] Add empty vault message when no documents indexed in `backend/src/services/rag_index.py`
124
+ - [ ] T037 Run quickstart.md validation: verify all setup steps work
125
+ - [ ] T038 Manual E2E test: full user journey through all implemented stories
126
+
127
+ ---
128
+
129
+ ## Dependencies & Execution Order
130
+
131
+ ### Phase Dependencies
132
+
133
+ - **Setup (Phase 1)**: No dependencies - can start immediately
134
+ - **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories
135
+ - **User Stories (Phase 3-5)**: All depend on Foundational phase completion
136
+ - US1+US2 (Phase 3) must complete before US3 (Phase 4)
137
+ - US3 can complete before US4 (Phase 5 is optional)
138
+ - **Polish (Phase 6)**: Can run after Phase 3 minimum
139
+
140
+ ### User Story Dependencies
141
+
142
+ - **User Story 1+2 (P1)**: Can start after Foundational (Phase 2) - No dependencies on other stories
143
+ - **User Story 3 (P2)**: Depends on US1+US2 for chat panel and history structure
144
+ - **User Story 4 (P3)**: Depends on US1+US2 for basic chat flow; optional feature
145
+
146
+ ### Within Each Phase
147
+
148
+ - Backend models before services
149
+ - Services before routes
150
+ - Backend before frontend integration
151
+ - Core implementation before error handling
152
+
153
+ ### Parallel Opportunities
154
+
155
+ **Phase 1:**
156
+ - T002 (TS types) and T003 (config) can run in parallel
157
+
158
+ **Phase 2:**
159
+ - T004 (models) must complete before T005-T008
160
+
161
+ **Phase 3:**
162
+ - T014, T015, T016 (frontend components) can run in parallel
163
+ - T012 (/status endpoint) can run in parallel with other backend work
164
+
165
+ **Phase 4:**
166
+ - T020-T024 are sequential (frontend then backend integration)
167
+
168
+ **Phase 5:**
169
+ - T025, T026 (helpers) sequential
170
+ - T030 (badge) can run parallel with backend once T029 complete
171
+
172
+ **Phase 6:**
173
+ - T033, T034, T035, T036 all parallel (different files)
174
+
175
+ ---
176
+
177
+ ## Parallel Example: Phase 3 Frontend
178
+
179
+ ```bash
180
+ # Launch all independent frontend components together:
181
+ Task: "Create RAG API client in frontend/src/services/rag.ts"
182
+ Task: "Create ChatMessage component in frontend/src/components/ChatMessage.tsx"
183
+ Task: "Create SourceList component in frontend/src/components/SourceList.tsx"
184
+ ```
185
+
186
+ ---
187
+
188
+ ## Implementation Strategy
189
+
190
+ ### MVP First (Phase 1-3 Only)
191
+
192
+ 1. Complete Phase 1: Setup (T001-T003)
193
+ 2. Complete Phase 2: Foundational (T004-T008)
194
+ 3. Complete Phase 3: User Story 1+2 (T009-T019)
195
+ 4. **STOP and VALIDATE**: Test RAG query and source display independently
196
+ 5. Deploy/demo if ready - this is the MVP!
197
+
198
+ ### Incremental Delivery
199
+
200
+ 1. Complete Setup + Foundational β†’ Foundation ready
201
+ 2. Add US1+US2 β†’ Test independently β†’ **Deploy/Demo (MVP!)**
202
+ 3. Add US3 β†’ Test multi-turn β†’ Deploy/Demo
203
+ 4. Add US4 (optional) β†’ Test note creation β†’ Deploy/Demo
204
+ 5. Each story adds value without breaking previous stories
205
+
206
+ ### Estimated Effort
207
+
208
+ | Phase | Tasks | Estimated Hours |
209
+ |-------|-------|-----------------|
210
+ | Setup | 3 | 0.5 |
211
+ | Foundational | 5 | 2 |
212
+ | US1+US2 (MVP) | 11 | 4 |
213
+ | US3 | 5 | 2 |
214
+ | US4 (optional) | 8 | 3 |
215
+ | Polish | 6 | 1.5 |
216
+ | **Total** | **38** | **13** |
217
+
218
+ ---
219
+
220
+ ## Notes
221
+
222
+ - [P] tasks = different files, no dependencies
223
+ - [Story] label maps task to specific user story for traceability
224
+ - US1 and US2 combined in Phase 3 since they're tightly coupled (source display is part of query response)
225
+ - US4 is optional per spec - can skip if time-constrained
226
+ - Constitution requires pytest tests for backend features
227
+ - Frontend testing is manual verification per Constitution
228
+