Commit 05c9551 · bigwolfe committed · Parent(s): 2e98ac7

init
Files changed:

- CLAUDE.md (+7 -0)
- specs/004-gemini-vault-chat/checklists/requirements.md (+37 -0)
- specs/004-gemini-vault-chat/contracts/rag-api.yaml (+219 -0)
- specs/004-gemini-vault-chat/data-model.md (+122 -0)
- specs/004-gemini-vault-chat/plan.md (+130 -0)
- specs/004-gemini-vault-chat/quickstart.md (+144 -0)
- specs/004-gemini-vault-chat/research.md (+152 -0)
- specs/004-gemini-vault-chat/spec.md (+122 -0)
- specs/004-gemini-vault-chat/tasks.md (+228 -0)
CLAUDE.md (CHANGED)

````diff
@@ -271,3 +271,10 @@ Current active feature: `001-obsidian-docs-viewer`
 ```
 
 Obtain JWT: `POST /api/tokens` after HF OAuth login.
+
+## Active Technologies
+- Python 3.11+ (backend), TypeScript (frontend) + FastAPI, LlamaIndex, llama-index-llms-google-genai, llama-index-embeddings-google-genai, React 18+, Tailwind CSS, Shadcn/UI (004-gemini-vault-chat)
+- Filesystem vault (existing), LlamaIndex persisted vector store (new, under `data/llamaindex/`) (004-gemini-vault-chat)
+
+## Recent Changes
+- 004-gemini-vault-chat: Added Python 3.11+ (backend), TypeScript (frontend) + FastAPI, LlamaIndex, llama-index-llms-google-genai, llama-index-embeddings-google-genai, React 18+, Tailwind CSS, Shadcn/UI
````
specs/004-gemini-vault-chat/checklists/requirements.md (ADDED, +37 lines)

# Specification Quality Checklist: Gemini Vault Chat Agent

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2025-11-28
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- All validation items passed on first review
- Spec is ready for the `/speckit.plan` phase
- The feature input was detailed from existing planning documents, enabling a complete spec without clarifications
specs/004-gemini-vault-chat/contracts/rag-api.yaml (ADDED, +219 lines)

```yaml
openapi: 3.0.3
info:
  title: Document-MCP RAG Chat API
  description: Gemini-powered RAG chat endpoints for vault content querying
  version: 1.0.0

paths:
  /api/rag/chat:
    post:
      summary: Send a message and get AI-generated response
      description: |
        Accepts conversation history and returns an AI-synthesized answer
        based on vault content, along with source references.
      operationId: ragChat
      tags:
        - RAG
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatRequest'
            example:
              messages:
                - role: user
                  content: "How does authentication work in this project?"
      responses:
        '200':
          description: Successful response with answer and sources
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatResponse'
              example:
                answer: "Authentication in this project uses JWT tokens..."
                sources:
                  - path: "API Documentation.md"
                    title: "API Documentation"
                    snippet: "The system uses JWT-based authentication..."
                    score: 0.92
                notes_written: []
        '400':
          description: Invalid request (empty messages, invalid format)
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
        '503':
          description: AI service unavailable
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error: "AI service temporarily unavailable"
                code: "SERVICE_UNAVAILABLE"

  /api/rag/status:
    get:
      summary: Check RAG service status
      description: Returns whether the index is ready and service is operational
      operationId: ragStatus
      tags:
        - RAG
      responses:
        '200':
          description: Service status
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/StatusResponse'

components:
  schemas:
    ChatMessage:
      type: object
      required:
        - role
        - content
      properties:
        role:
          type: string
          enum: [user, assistant]
          description: Message author
        content:
          type: string
          maxLength: 10000
          description: Message text
        timestamp:
          type: string
          format: date-time
          description: When the message was created
        sources:
          type: array
          items:
            $ref: '#/components/schemas/SourceReference'
          description: Referenced notes (assistant messages only)
        notes_written:
          type: array
          items:
            $ref: '#/components/schemas/NoteWritten'
          description: Notes created by agent (Phase 2)

    SourceReference:
      type: object
      required:
        - path
        - title
        - snippet
      properties:
        path:
          type: string
          description: Relative path in vault
          example: "guides/authentication.md"
        title:
          type: string
          description: Note title
          example: "Authentication Guide"
        snippet:
          type: string
          maxLength: 500
          description: Relevant text excerpt
        score:
          type: number
          format: float
          minimum: 0
          maximum: 1
          description: Relevance score

    NoteWritten:
      type: object
      required:
        - path
        - title
        - action
      properties:
        path:
          type: string
          description: Path to created/updated note
          pattern: "^agent-notes/.+\\.md$"
        title:
          type: string
          description: Note title
        action:
          type: string
          enum: [created, updated]
          description: What the agent did

    ChatRequest:
      type: object
      required:
        - messages
      properties:
        messages:
          type: array
          minItems: 1
          items:
            $ref: '#/components/schemas/ChatMessage'
          description: Conversation history; the last message must be from the user

    ChatResponse:
      type: object
      required:
        - answer
        - sources
        - notes_written
      properties:
        answer:
          type: string
          description: AI-generated response
        sources:
          type: array
          items:
            $ref: '#/components/schemas/SourceReference'
          description: Notes used in the response
        notes_written:
          type: array
          items:
            $ref: '#/components/schemas/NoteWritten'
          description: Notes created (Phase 2, may be empty)

    StatusResponse:
      type: object
      required:
        - status
        - index_ready
      properties:
        status:
          type: string
          enum: [ready, initializing, error]
        index_ready:
          type: boolean
          description: Whether the vector index is loaded
        documents_indexed:
          type: integer
          description: Number of documents in the index
        error_message:
          type: string
          description: Error details if status is error

    ErrorResponse:
      type: object
      required:
        - error
        - code
      properties:
        error:
          type: string
          description: Human-readable error message
        code:
          type: string
          description: Machine-readable error code
          enum:
            - INVALID_REQUEST
            - EMPTY_MESSAGES
            - SERVICE_UNAVAILABLE
            - INDEX_NOT_READY
            - RATE_LIMITED
```
specs/004-gemini-vault-chat/data-model.md (ADDED, +122 lines)

# Data Model: Gemini Vault Chat Agent

**Feature**: 004-gemini-vault-chat
**Date**: 2025-11-28

## Entities

### ChatMessage

Represents a single message in a conversation.

| Field | Type | Description | Constraints |
|-------|------|-------------|-------------|
| role | enum | Message author | `user` or `assistant` |
| content | string | Message text | Max 10,000 characters |
| timestamp | datetime | When message was created | ISO 8601 format |
| sources | SourceReference[] | Referenced notes (assistant only) | Optional, empty for user messages |
| notes_written | NoteWritten[] | Notes created by agent | Optional, Phase 2 only |

### SourceReference

Metadata about a note used to generate a response.

| Field | Type | Description | Constraints |
|-------|------|-------------|-------------|
| path | string | Relative path in vault | Valid vault path, ends in `.md` |
| title | string | Note title | Derived from frontmatter/H1/filename |
| snippet | string | Relevant text excerpt | Max 500 characters |
| score | float | Relevance score | 0.0 to 1.0, optional |

### NoteWritten (Phase 2)

Metadata about a note created or updated by the agent.

| Field | Type | Description | Constraints |
|-------|------|-------------|-------------|
| path | string | Path to created/updated note | Must be in `agent-notes/` folder |
| title | string | Note title | Required |
| action | enum | What the agent did | `created` or `updated` |

### ChatRequest

Request payload for the RAG chat endpoint.

| Field | Type | Description | Constraints |
|-------|------|-------------|-------------|
| messages | ChatMessage[] | Conversation history | At least 1 message, last must be `user` |

### ChatResponse

Response payload from the RAG chat endpoint.

| Field | Type | Description | Constraints |
|-------|------|-------------|-------------|
| answer | string | AI-generated response | Required |
| sources | SourceReference[] | Notes used in response | May be empty |
| notes_written | NoteWritten[] | Notes created (Phase 2) | May be empty |

## State Transitions

### Conversation Session

```
[No Session] ---(user opens chat panel)---> [Active Session]
[Active Session] ---(user sends message)---> [Waiting for Response]
[Waiting for Response] ---(response received)---> [Active Session]
[Active Session] ---(page refresh/close)---> [No Session]
```

### Index Lifecycle

```
[No Index] ---(startup, no persist dir)---> [Building Index]
[Building Index] ---(indexing complete)---> [Index Ready]
[No Index] ---(startup, persist dir exists)---> [Loading Index]
[Loading Index] ---(load successful)---> [Index Ready]
[Loading Index] ---(load failed)---> [Building Index]
[Index Ready] ---(query received)---> [Index Ready]
```

## Validation Rules

### ChatMessage Validation

1. `role` must be exactly `user` or `assistant`
2. `content` must not be empty (whitespace-only is invalid)
3. `content` must be ≤10,000 characters
4. `sources` must be empty for `user` role messages

### SourceReference Validation

1. `path` must be a valid vault path (see `validate_note_path` in vault.py)
2. `title` must not be empty
3. `snippet` must be ≤500 characters
4. `score`, if present, must be between 0.0 and 1.0

### NoteWritten Validation (Phase 2)

1. `path` must start with `agent-notes/`
2. `path` must be a valid vault path
3. `action` must be `created` or `updated`

## Relationships

```
ChatSession (frontend state)
└── contains 0..* ChatMessage
    ├── assistant messages contain 0..* SourceReference
    │   └── references 1 VaultNote (existing)
    └── assistant messages may contain 0..* NoteWritten
        └── creates/updates 1 VaultNote
```

## Persistence

| Entity | Storage | Lifetime |
|--------|---------|----------|
| ChatMessage | Frontend memory | Session (cleared on refresh) |
| SourceReference | Derived from query | Per response |
| NoteWritten | VaultService (filesystem) | Permanent |
| Vector Index | LlamaIndex persist dir | Until rebuild |
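The ChatMessage and ChatRequest validation rules above can be sketched as plain functions. This is a minimal illustration only; the project would more likely express these as Pydantic models, and the function names here are hypothetical:

```python
def validate_chat_message(msg: dict) -> list[str]:
    """Apply the ChatMessage rules; returns a list of violations (empty = valid)."""
    errors = []
    if msg.get("role") not in ("user", "assistant"):
        errors.append("role must be 'user' or 'assistant'")
    content = msg.get("content", "")
    if not content.strip():
        errors.append("content must not be empty or whitespace-only")
    if len(content) > 10_000:
        errors.append("content must be at most 10,000 characters")
    if msg.get("role") == "user" and msg.get("sources"):
        errors.append("sources must be empty for user messages")
    return errors


def validate_chat_request(messages: list[dict]) -> list[str]:
    """Apply the ChatRequest rules: at least one message, last must be from the user."""
    errors = []
    if not messages:
        errors.append("messages must contain at least 1 message")
    elif messages[-1].get("role") != "user":
        errors.append("the last message must be from the user")
    for m in messages:
        errors.extend(validate_chat_message(m))
    return errors
```

Returning a list of violations rather than raising on the first one makes it easy for the route handler to build a single `ErrorResponse` with every problem at once.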
specs/004-gemini-vault-chat/plan.md (ADDED, +130 lines)

# Implementation Plan: Gemini Vault Chat Agent

**Branch**: `004-gemini-vault-chat` | **Date**: 2025-11-28 | **Spec**: [spec.md](./spec.md)
**Input**: Feature specification from `/specs/004-gemini-vault-chat/spec.md`

## Summary

Add a Gemini-powered RAG chat agent to the Document-MCP platform. Users can ask natural language questions about their Markdown vault and receive AI-synthesized answers grounded in their documents. The system uses LlamaIndex for document indexing and retrieval, with Gemini as both the LLM and embedding model. An optional Phase 2 adds constrained note-writing capabilities.

## Technical Context

**Language/Version**: Python 3.11+ (backend), TypeScript (frontend)
**Primary Dependencies**: FastAPI, LlamaIndex, llama-index-llms-google-genai, llama-index-embeddings-google-genai, React 18+, Tailwind CSS, Shadcn/UI
**Storage**: Filesystem vault (existing), LlamaIndex persisted vector store (new, under `data/llamaindex/`)
**Testing**: pytest (backend), manual verification (frontend)
**Target Platform**: Hugging Face Spaces (Docker), Linux server
**Project Type**: Web application (frontend + backend)
**Performance Goals**: <5 seconds for RAG response (per SC-001)
**Constraints**: Must not break existing MCP server or ChatGPT widget
**Scale/Scope**: Hackathon scale; index rebuilds acceptable on restart

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

| Principle | Status | Notes |
|-----------|--------|-------|
| I. Brownfield Integration | ✅ Pass | Uses existing VaultService, adds new routes/services alongside existing code |
| II. Test-Backed Development | ✅ Pass | Plan includes pytest tests for RAG service; frontend is manual verification |
| III. Incremental Delivery | ✅ Pass | P1 stories (read-only RAG) can ship before P3 (write tools) |
| IV. Specification-Driven | ✅ Pass | All work traced to spec.md; Phase 2 is optional per spec |
| No Magic | ✅ Pass | Direct LlamaIndex usage, no custom abstractions |
| Single Source of Truth | ✅ Pass | Vault remains source of truth; index is derived view |
| Error Handling | ✅ Pass | Spec requires FR-011 error messages for AI unavailability |

**Technology Stack Compliance**:

- Backend: Python 3.11+, FastAPI, Pydantic ✅
- Frontend: React 18+, TypeScript, Tailwind, Shadcn/UI ✅
- Storage: Filesystem-based (LlamaIndex persisted store) ✅

## Project Structure

### Documentation (this feature)

```text
specs/004-gemini-vault-chat/
├── plan.md              # This file
├── research.md          # Phase 0 output
├── data-model.md        # Phase 1 output
├── quickstart.md        # Phase 1 output
├── contracts/           # Phase 1 output
│   └── rag-api.yaml     # OpenAPI spec for RAG endpoints
└── tasks.md             # Phase 2 output (created by /speckit.tasks)
```

### Source Code (repository root)

```text
backend/
├── src/
│   ├── api/
│   │   └── routes/
│   │       └── rag.py            # NEW: RAG chat endpoint
│   ├── models/
│   │   └── rag.py                # NEW: Pydantic models for RAG
│   └── services/
│       └── rag_index.py          # NEW: LlamaIndex service
└── tests/
    └── unit/
        └── test_rag_service.py   # NEW: RAG service tests

frontend/
├── src/
│   ├── components/
│   │   ├── ChatPanel.tsx         # NEW: Chat interface
│   │   ├── ChatMessage.tsx       # NEW: Message component
│   │   └── SourceList.tsx        # NEW: Source references
│   ├── services/
│   │   └── rag.ts                # NEW: RAG API client
│   └── types/
│       └── rag.ts                # NEW: TypeScript types

data/
└── llamaindex/                   # NEW: Persisted vector index
```

**Structure Decision**: Web application structure (Option 2). New files added alongside existing code per Constitution Principle I.

## Complexity Tracking

> No violations requiring justification.

## Implementation Phases

### Phase 1: Core RAG Query (P1 Stories)

Implements User Stories 1-2: ask questions, view sources.

**Backend Tasks**:

1. Add LlamaIndex dependencies to `requirements.txt`
2. Create `rag_index.py` service with a `get_or_build_index()` singleton
3. Create `rag.py` Pydantic models for request/response
4. Create `rag.py` route with the `POST /api/rag/chat` endpoint
5. Add unit tests for the RAG service

**Frontend Tasks**:

1. Create `ChatPanel.tsx` component with message list and composer
2. Create `ChatMessage.tsx` for rendering user/assistant messages
3. Create `SourceList.tsx` for collapsible source references
4. Add `rag.ts` API client service
5. Integrate ChatPanel into the MainApp layout

### Phase 2: Multi-Turn Conversation (P2 Story)

Implements User Story 3: context-aware follow-ups.

**Tasks**:

1. Maintain chat history in frontend state
2. Pass the full message history to the backend
3. Update the RAG service to use chat history for context

### Phase 3: Agent Note Writing (P3 Story, Optional)

Implements User Story 4: create/append notes via the agent.

**Tasks**:

1. Create constrained write helpers (`create_note`, `append_to_note`)
2. Register them as LlamaIndex agent tools
3. Add `notes_written` to the response model
4. Show a created-notes badge in the UI
specs/004-gemini-vault-chat/quickstart.md (ADDED, +144 lines)

# Quickstart: Gemini Vault Chat Agent

**Feature**: 004-gemini-vault-chat
**Date**: 2025-11-28

## Prerequisites

1. Python 3.11+ installed
2. Node.js 18+ installed
3. Google API key with Gemini access

## Setup

### 1. Install Backend Dependencies

```bash
cd backend
pip install llama-index llama-index-llms-google-genai llama-index-embeddings-google-genai
```

### 2. Configure Environment

Add to your `.env` file (or export in your terminal):

```bash
GOOGLE_API_KEY=your-gemini-api-key-here
VAULT_DIR=data/vaults/demo-user
LLAMAINDEX_PERSIST_DIR=data/llamaindex
```

### 3. Start Backend

```bash
cd backend
uvicorn src.api.main:app --reload --port 8000
```

The RAG index is built on first startup (this may take a few seconds).

### 4. Start Frontend

```bash
cd frontend
npm install
npm run dev
```

## Verify Installation

### Check RAG Status

```bash
curl http://localhost:8000/api/rag/status
```

Expected response:

```json
{
  "status": "ready",
  "index_ready": true,
  "documents_indexed": 15
}
```

### Test RAG Chat

```bash
curl -X POST http://localhost:8000/api/rag/chat \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What is this project about?"}]}'
```

Expected response:

```json
{
  "answer": "This project is Document-MCP, a...",
  "sources": [
    {
      "path": "Getting Started.md",
      "title": "Getting Started",
      "snippet": "Document-MCP provides...",
      "score": 0.89
    }
  ],
  "notes_written": []
}
```

## Development Workflow

### Backend Changes

1. Edit files in `backend/src/services/rag_index.py` or `backend/src/api/routes/rag.py`
2. The server auto-reloads with the `--reload` flag
3. Run tests: `cd backend && pytest tests/unit/test_rag_service.py -v`

### Frontend Changes

1. Edit files in `frontend/src/components/` (ChatPanel, ChatMessage, SourceList)
2. Vite auto-reloads on save
3. Open the browser at `http://localhost:5173`

### Rebuilding the Index

Delete the persist directory and restart:

```bash
rm -rf data/llamaindex
# Restart backend
```

## File Locations

| Component | Path |
|-----------|------|
| RAG Service | `backend/src/services/rag_index.py` |
| RAG Routes | `backend/src/api/routes/rag.py` |
| RAG Models | `backend/src/models/rag.py` |
| Chat Panel | `frontend/src/components/ChatPanel.tsx` |
| API Client | `frontend/src/services/rag.ts` |
| Types | `frontend/src/types/rag.ts` |
| Index Storage | `data/llamaindex/` |

## Troubleshooting

### "GOOGLE_API_KEY not set"

Ensure the environment variable is exported:

```bash
export GOOGLE_API_KEY=your-key-here
```

### "Index not ready"

Wait a few seconds after startup for indexing to complete. Check the logs for errors.

### "Rate limited"

The Gemini API has rate limits. Wait and retry, or check your API quota.

### Empty sources in response

Check that your vault contains Markdown files. Run `ls data/vaults/demo-user/` to verify.
specs/004-gemini-vault-chat/research.md
ADDED
|
@@ -0,0 +1,152 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
# Research: Gemini Vault Chat Agent

**Feature**: 004-gemini-vault-chat
**Date**: 2025-11-28

## LlamaIndex Integration

### Decision: Use LlamaIndex Core with Google GenAI Extensions

**Rationale**: LlamaIndex provides a mature, well-documented framework for building RAG applications. The `llama-index-llms-google-genai` and `llama-index-embeddings-google-genai` packages provide first-class Gemini support without requiring custom integration code.

**Alternatives Considered**:
- **LangChain**: More complex, with a larger dependency footprint. LlamaIndex is more focused on document-retrieval use cases.
- **Direct Gemini API**: Would require implementing chunking, embedding, and retrieval logic manually. Higher development effort.
- **OpenAI + pgvector**: Requires PostgreSQL, which conflicts with the SQLite-only approach in the constitution.

### Key LlamaIndex Patterns

```python
# Singleton index pattern (recommended)
from pathlib import Path

from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.embeddings.google_genai import GoogleGenAIEmbedding
from llama_index.llms.google_genai import GoogleGenAI

# Configure Gemini globally; without this, LlamaIndex falls back to
# its OpenAI defaults for both the LLM and the embedding model.
Settings.llm = GoogleGenAI(model="gemini-1.5-flash")
Settings.embed_model = GoogleGenAIEmbedding(model_name="text-embedding-004")

_index: VectorStoreIndex | None = None

def get_or_build_index(vault_path: Path, persist_dir: Path) -> VectorStoreIndex:
    global _index
    if _index is not None:
        return _index

    if persist_dir.exists():
        storage_context = StorageContext.from_defaults(persist_dir=str(persist_dir))
        _index = load_index_from_storage(storage_context)
    else:
        documents = SimpleDirectoryReader(str(vault_path), recursive=True).load_data()
        _index = VectorStoreIndex.from_documents(documents)
        _index.storage_context.persist(persist_dir=str(persist_dir))

    return _index
```

## Gemini Model Selection

### Decision: gemini-1.5-flash for LLM, text-embedding-004 for Embeddings

**Rationale**:
- `gemini-1.5-flash` offers a good balance of speed and quality for interactive chat
- `text-embedding-004` is Google's latest text embedding model, producing 768-dimensional vectors
- Both are cost-effective at hackathon/demo scale

**Alternatives Considered**:
- `gemini-1.5-pro`: Higher quality but slower and more expensive
- `gemini-2.0-flash-exp`: Experimental, may not be stable

### Environment Variables

```
GOOGLE_API_KEY=<api-key>
VAULT_DIR=data/vaults/demo-user        # Or resolved dynamically per user
LLAMAINDEX_PERSIST_DIR=data/llamaindex
```

## Source Attribution Strategy

### Decision: Extract source metadata from LlamaIndex response nodes

**Rationale**: LlamaIndex query responses include source nodes with file paths and text chunks. We can map these back to vault note paths and extract snippets for display.

```python
response = query_engine.query(question)
sources = []
for node in response.source_nodes:
    sources.append({
        "path": node.metadata.get("file_path"),
        "title": derive_title_from_path(node.metadata.get("file_path")),
        "snippet": node.text[:200] + "..." if len(node.text) > 200 else node.text,
        "score": node.score,
    })
```

## Multi-Turn Conversation

### Decision: Use LlamaIndex ChatEngine for conversation memory

**Rationale**: LlamaIndex provides `as_chat_engine()`, which wraps the index with conversation memory. This handles context naturally without a custom implementation.

```python
chat_engine = index.as_chat_engine(
    chat_mode="context",
    llm=GoogleGenAI(model="gemini-1.5-flash"),
)
response = chat_engine.chat("Follow-up question here")
```

**Note**: For the MVP, we'll use a simpler approach in which the frontend passes the full message history and we construct context in the query. This avoids server-side session state.

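The MVP approach in the note above can be sketched as a pure function that folds prior turns into a single query string (an illustrative helper; the function name and prompt format are assumptions, not part of the LlamaIndex API):

```python
def build_contextualized_query(history: list[dict], question: str) -> str:
    """Fold prior turns into one query string (hypothetical MVP helper)."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    return (
        "Conversation so far:\n"
        f"{transcript}\n\n"
        f"Current question: {question}"
    )
```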
## Agent Tools (Phase 2)

### Decision: Use LlamaIndex FunctionTool with constrained paths

**Rationale**: LlamaIndex supports registering Python functions as tools for agentic use. We can constrain write operations to an `agent-notes/` subdirectory.

```python
from llama_index.core.tools import FunctionTool

def create_note(title: str, content: str) -> str:
    """Create a new note in the agent folder."""
    safe_filename = slugify(title)
    path = f"agent-notes/{safe_filename}.md"
    vault_service.write_note(user_id, path, title=title, body=content)
    return f"Created note: {path}"

create_note_tool = FunctionTool.from_defaults(fn=create_note)
```

## Error Handling

### Decision: Graceful degradation with user-friendly messages

**Patterns**:
1. API key missing → 503 "AI service not configured"
2. API rate limit → 429 "Please wait and try again"
3. Network error → 503 "AI service temporarily unavailable"
4. Empty vault → 200 with message "No documents indexed"

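These patterns can be centralized in a small mapper from failure modes to HTTP status/message pairs (a minimal sketch; the exception-name matching is an assumption about what the Gemini client raises, not a confirmed API):

```python
def map_rag_error(exc: Exception, api_key_configured: bool) -> tuple[int, str]:
    """Map RAG failure modes to (HTTP status, user-facing message)."""
    if not api_key_configured:
        return 503, "AI service not configured"
    # Heuristic: rate-limit errors from Google clients typically have
    # names like RateLimitError or ResourceExhausted (assumption).
    name = type(exc).__name__
    if "RateLimit" in name or "ResourceExhausted" in name:
        return 429, "Please wait and try again"
    return 503, "AI service temporarily unavailable"
```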
## Performance Considerations

### Index Persistence

- First indexing: ~1-5 seconds for small vaults (<100 notes)
- Subsequent loads: ~100 ms from persisted storage
- Query latency: ~1-3 seconds, depending on Gemini API response time

### Recommendations

1. Load the index on startup (not on first request)
2. Use an environment variable to configure the persist directory
3. For large vaults, consider lazy loading or background indexing (post-MVP)

## Dependencies to Add

```
# requirements.txt additions
llama-index
llama-index-llms-google-genai
llama-index-embeddings-google-genai
```

**Note**: These packages pull in their own dependencies (e.g., `google-generativeai`). Tested compatible with Python 3.11+.
specs/004-gemini-vault-chat/spec.md
ADDED
@@ -0,0 +1,122 @@
# Feature Specification: Gemini Vault Chat Agent

**Feature Branch**: `004-gemini-vault-chat`
**Created**: 2025-11-28
**Status**: Draft
**Input**: User description: "Add a Gemini-powered planning chat agent using LlamaIndex for RAG over the Markdown vault. Use Gemini as both LLM and embedding model. Include a new chat panel in the HF Space frontend that calls a RAG backend endpoint, displays assistant responses with linked sources, and optionally allows the agent to write notes."

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Ask Questions About Vault Content (Priority: P1)

A user opens the Gemini Planning Agent panel and asks a question about content stored in their Markdown vault. The system searches the vault, retrieves relevant passages, and returns an AI-generated answer that synthesizes information from the relevant notes.

**Why this priority**: This is the core value proposition: enabling users to query their knowledge base conversationally and get AI-synthesized answers grounded in their own documents.

**Independent Test**: Can be fully tested by typing a question and verifying the response is relevant to vault content, with sources listed.

**Acceptance Scenarios**:

1. **Given** a vault containing notes about project architecture, **When** user asks "How does authentication work?", **Then** the system returns an answer citing relevant notes with snippets
2. **Given** a vault with multiple related notes, **When** user asks a question that spans multiple topics, **Then** the system synthesizes information from multiple sources and lists all referenced notes
3. **Given** a vault with no relevant content, **When** user asks an unrelated question, **Then** the system responds that no relevant information was found in the vault

---

### User Story 2 - View Source Notes (Priority: P1)

After receiving an answer from the chat agent, the user can see which notes were used to generate the response. They can click on a source to view the note in the existing document viewer or see an inline snippet.

**Why this priority**: Source attribution is essential for trust and verification. Users need to know where information comes from and validate AI responses against original content.

**Independent Test**: Can be tested by receiving an answer and clicking on a listed source to verify it opens the correct note.

**Acceptance Scenarios**:

1. **Given** an assistant response with sources, **When** user clicks a source link, **Then** the corresponding note opens in the document viewer
2. **Given** an assistant response with sources, **When** user expands a source, **Then** they see a snippet of the relevant passage
3. **Given** an assistant response, **When** sources are displayed, **Then** each source shows the note title and path

---

### User Story 3 - Multi-Turn Conversation (Priority: P2)

Users can have a multi-turn conversation with the agent, asking follow-up questions that build on previous context. The agent maintains conversation history for coherent responses.

**Why this priority**: Natural conversation flow improves user experience, but basic single-query functionality delivers core value first.

**Independent Test**: Can be tested by asking a question, then asking a follow-up that references "it" or "that", and verifying the agent understands the context.

**Acceptance Scenarios**:

1. **Given** a previous question about "authentication", **When** user asks "How do I configure it?", **Then** the agent understands "it" refers to authentication
2. **Given** an ongoing conversation, **When** user starts a new topic, **Then** the agent responds appropriately to the new context
3. **Given** a conversation session, **When** user refreshes the page, **Then** conversation history is cleared (a new session starts)

---

### User Story 4 - Agent Creates Notes (Priority: P3)

Users can instruct the agent to create new notes based on the conversation. The agent writes notes to a dedicated folder in the vault and informs the user what was created.

**Why this priority**: Note creation adds significant value but requires more complex safety controls. Core reading/query functionality should be solid first.

**Independent Test**: Can be tested by asking the agent to "create a summary note about X" and verifying a new note appears in the designated folder.

**Acceptance Scenarios**:

1. **Given** a conversation about a topic, **When** user asks "create a summary note", **Then** the agent creates a new Markdown note in the agent folder
2. **Given** an agent-created note, **When** user views the response, **Then** a badge or link shows the created note path
3. **Given** an existing note, **When** user asks the agent to append content, **Then** the agent updates the existing note appropriately

---

### Edge Cases

- What happens when the vault is empty or has no indexed content? → System returns a friendly message indicating no documents are available
- How does the system handle very long user queries? → Query is truncated to reasonable limits with user notification
- What happens if the AI service is unavailable? → System shows an error message and suggests retrying
- How are malformed or non-Markdown files handled? → Non-Markdown files are ignored during indexing
- What if the agent tries to write outside the designated folder? → Write operations are constrained to the agent folder only

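The last edge case (constraining agent writes) can be enforced with a simple path guard; this is an illustrative sketch, with the folder name taken from the plan's `agent-notes/` convention:

```python
from pathlib import PurePosixPath

AGENT_FOLDER = "agent-notes"

def is_allowed_agent_write(path: str) -> bool:
    """Return True only for relative Markdown paths inside the agent folder."""
    p = PurePosixPath(path)
    if p.is_absolute() or ".." in p.parts:
        return False  # reject traversal and absolute paths
    return p.parts[:1] == (AGENT_FOLDER,) and p.suffix == ".md"
```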
## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: System MUST provide a chat interface for users to ask natural language questions about vault content
- **FR-002**: System MUST search the vault and retrieve relevant passages to answer user queries
- **FR-003**: System MUST generate AI responses that synthesize information from retrieved content
- **FR-004**: System MUST display source notes for each response, including note title and path
- **FR-005**: System MUST allow users to navigate from a source reference to the full note
- **FR-006**: System MUST maintain conversation history within a session for multi-turn dialogue
- **FR-007**: System MUST build and persist a searchable index of vault content
- **FR-008**: System MUST load an existing index on startup if available
- **FR-009**: System MUST constrain agent write operations to a designated agent folder only
- **FR-010**: System MUST display a notification when the agent creates or updates a note
- **FR-011**: System MUST show an appropriate error message if the AI service is unavailable

### Key Entities

- **Chat Message**: Represents a single message in the conversation (role: user or assistant, content, timestamp)
- **Chat Session**: A collection of messages in a single conversation context (started when the user opens the panel, cleared on page refresh)
- **Source Reference**: Metadata about a note used to generate a response (note title, path, relevant snippet)
- **Agent Note**: A Markdown note created by the agent, stored in the designated agent folder

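A minimal sketch of these entities as plain Python dataclasses (illustrative only; field names beyond those listed above are assumptions, and the feature's data-model.md remains authoritative):

```python
from dataclasses import dataclass, field

@dataclass
class SourceReference:
    title: str
    path: str
    snippet: str

@dataclass
class ChatMessage:
    role: str      # "user" or "assistant"
    content: str
    timestamp: str # ISO 8601 string; the exact representation is an assumption
    sources: list[SourceReference] = field(default_factory=list)

@dataclass
class ChatSession:
    messages: list[ChatMessage] = field(default_factory=list)
```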
## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: Users receive a relevant answer with sources within 5 seconds of submitting a query
- **SC-002**: 90% of responses include at least one source reference when relevant content exists
- **SC-003**: Users can navigate from a source reference to the full note in one click
- **SC-004**: Multi-turn conversations correctly reference previous context in 80% of follow-up questions
- **SC-005**: Agent-created notes appear in the designated folder and are visible in the vault viewer within 2 seconds
- **SC-006**: System gracefully handles AI service unavailability with a clear error message

## Assumptions

- Users have a Markdown vault with content they want to query
- The existing document viewer from the Docs Widget can be reused for viewing source notes
- Index rebuilds are acceptable on service restarts for the initial release
- Session history is ephemeral and not persisted across page refreshes
- Agent write operations are limited to creating and appending to notes (no deletion)
specs/004-gemini-vault-chat/tasks.md
ADDED
@@ -0,0 +1,228 @@
# Tasks: Gemini Vault Chat Agent

**Input**: Design documents from `/specs/004-gemini-vault-chat/`
**Prerequisites**: plan.md ✅, spec.md ✅, research.md ✅, data-model.md ✅, contracts/ ✅

**Tests**: Unit tests for the RAG service included per Constitution (Test-Backed Development).

**Organization**: Tasks grouped by user story for independent implementation and testing.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (US1, US2, US3, US4)
- Include exact file paths in descriptions

## Path Conventions

- **Backend**: `backend/src/`, `backend/tests/`
- **Frontend**: `frontend/src/`
- **Data**: `data/llamaindex/`

---

## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Add dependencies and create type definitions

- [ ] T001 Add LlamaIndex dependencies to `backend/requirements.txt`: llama-index, llama-index-llms-google-genai, llama-index-embeddings-google-genai
- [ ] T002 [P] Create TypeScript types in `frontend/src/types/rag.ts`: ChatMessage, SourceReference, NoteWritten, ChatRequest, ChatResponse
- [ ] T003 [P] Add GOOGLE_API_KEY and LLAMAINDEX_PERSIST_DIR to environment configuration in `backend/src/services/config.py`

---

## Phase 2: Foundational (Blocking Prerequisites)

**Purpose**: Core backend infrastructure for RAG that all user stories depend on

**⚠️ CRITICAL**: No user story work can begin until this phase is complete

- [ ] T004 Create Pydantic models in `backend/src/models/rag.py`: ChatMessage, SourceReference, NoteWritten, ChatRequest, ChatResponse, StatusResponse, ErrorResponse
- [ ] T005 Create RAG index service skeleton in `backend/src/services/rag_index.py` with `get_or_build_index()` singleton pattern
- [ ] T006 Implement index persistence: load from `data/llamaindex/` if it exists, otherwise build and persist, in `backend/src/services/rag_index.py`
- [ ] T007 Create `backend/tests/unit/test_rag_service.py` with test stubs for index loading, query execution, and error handling
- [ ] T008 Register RAG routes in `backend/src/api/main.py` (import and include rag router)

**Checkpoint**: Foundation ready - RAG service can load/build index on startup

---

## Phase 3: User Story 1 & 2 - Ask Questions + View Sources (Priority: P1) 🎯 MVP

**Goal**: Users can ask questions and receive AI-synthesized answers with source attribution

**Independent Test**: Type a question in the chat panel, verify response includes answer text and clickable source references

### Backend Implementation (US1+US2)

- [ ] T009 [US1] Implement `rag_chat()` function in `backend/src/services/rag_index.py` that queries the index and returns an answer with sources
- [ ] T010 [US1] Extract source metadata from LlamaIndex response nodes (path, title, snippet, score) in `backend/src/services/rag_index.py`
- [ ] T011 [US1] Create POST `/api/rag/chat` endpoint in `backend/src/api/routes/rag.py` wrapping `rag_chat()`
- [ ] T012 [P] [US1] Create GET `/api/rag/status` endpoint in `backend/src/api/routes/rag.py` returning index status
- [ ] T013 [US1] Implement unit tests for `rag_chat()` in `backend/tests/unit/test_rag_service.py`: happy path, no results, error handling

### Frontend Implementation (US1+US2)

- [ ] T014 [P] [US2] Create RAG API client in `frontend/src/services/rag.ts` with `sendMessage()` and `getStatus()` functions
- [ ] T015 [P] [US2] Create ChatMessage component in `frontend/src/components/ChatMessage.tsx` rendering user/assistant messages
- [ ] T016 [P] [US2] Create SourceList component in `frontend/src/components/SourceList.tsx` with collapsible source references
- [ ] T017 [US1] Create ChatPanel component in `frontend/src/components/ChatPanel.tsx` with message list and composer textarea
- [ ] T018 [US1] Integrate ChatPanel into MainApp layout in `frontend/src/pages/MainApp.tsx` as a new panel/tab
- [ ] T019 [US2] Wire SourceList click handler to open note in document viewer via existing navigation

**Checkpoint**: User can ask a question, see AI answer with sources, and click a source to view the note

---

## Phase 4: User Story 3 - Multi-Turn Conversation (Priority: P2)

**Goal**: Users can have context-aware follow-up conversations

**Independent Test**: Ask "What is authentication?", then ask "How do I configure it?" - verify the agent understands "it" refers to authentication

### Implementation (US3)

- [ ] T020 [US3] Add message history state management in `frontend/src/components/ChatPanel.tsx` using React useState
- [ ] T021 [US3] Pass full message history array to `POST /api/rag/chat` in `frontend/src/services/rag.ts`
- [ ] T022 [US3] Update `rag_chat()` in `backend/src/services/rag_index.py` to construct context from message history
- [ ] T023 [US3] Add conversation reset button in `frontend/src/components/ChatPanel.tsx` to clear history
- [ ] T024 [US3] Add unit test for multi-turn context handling in `backend/tests/unit/test_rag_service.py`

**Checkpoint**: Multi-turn conversation maintains context; page refresh clears history

---

## Phase 5: User Story 4 - Agent Creates Notes (Priority: P3, Optional)

**Goal**: Agent can create/append notes in a designated folder

**Independent Test**: Ask "create a summary note about authentication" - verify a note appears in the `agent-notes/` folder

### Implementation (US4)

- [ ] T025 [US4] Create `create_note()` helper in `backend/src/services/rag_index.py` constrained to the `agent-notes/` folder
- [ ] T026 [US4] Create `append_to_note()` helper in `backend/src/services/rag_index.py` for updating existing notes
- [ ] T027 [US4] Register helpers as LlamaIndex FunctionTools in `backend/src/services/rag_index.py`
- [ ] T028 [US4] Update `rag_chat()` to use agent mode with tools when write intent is detected
- [ ] T029 [US4] Add `notes_written` to ChatResponse and include it in the API response from `backend/src/api/routes/rag.py`
- [ ] T030 [P] [US4] Add NoteWritten badge component in `frontend/src/components/ChatMessage.tsx` showing the created note path
- [ ] T031 [US4] Wire badge click to navigate to the created note in the vault viewer
- [ ] T032 [US4] Add unit tests for constrained write operations in `backend/tests/unit/test_rag_service.py`

**Checkpoint**: Agent can create notes; writes constrained to the `agent-notes/` folder

---

## Phase 6: Polish & Cross-Cutting Concerns

**Purpose**: Error handling, edge cases, and validation

- [ ] T033 [P] Implement error handling for missing GOOGLE_API_KEY with 503 response in `backend/src/services/rag_index.py`
- [ ] T034 [P] Implement error handling for API rate limits with 429 response in `backend/src/api/routes/rag.py`
- [ ] T035 [P] Add loading state and error display in `frontend/src/components/ChatPanel.tsx`
- [ ] T036 [P] Add empty-vault message when no documents are indexed in `backend/src/services/rag_index.py`
- [ ] T037 Run quickstart.md validation: verify all setup steps work
- [ ] T038 Manual E2E test: full user journey through all implemented stories

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies - can start immediately
- **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories
- **User Stories (Phases 3-5)**: All depend on Foundational phase completion
  - US1+US2 (Phase 3) must complete before US3 (Phase 4)
  - US3 can complete before US4 (Phase 5 is optional)
- **Polish (Phase 6)**: Can run after Phase 3 minimum

### User Story Dependencies

- **User Story 1+2 (P1)**: Can start after Foundational (Phase 2) - No dependencies on other stories
- **User Story 3 (P2)**: Depends on US1+US2 for chat panel and history structure
- **User Story 4 (P3)**: Depends on US1+US2 for basic chat flow; optional feature

### Within Each Phase

- Backend models before services
- Services before routes
- Backend before frontend integration
- Core implementation before error handling

### Parallel Opportunities

**Phase 1:**
- T002 (TS types) and T003 (config) can run in parallel

**Phase 2:**
- T004 (models) must complete before T005-T008

**Phase 3:**
- T014, T015, T016 (frontend components) can run in parallel
- T012 (/status endpoint) can run in parallel with other backend work

**Phase 4:**
- T020-T024 are sequential (frontend, then backend integration)

**Phase 5:**
- T025, T026 (helpers) are sequential
- T030 (badge) can run in parallel with backend work once T029 is complete

**Phase 6:**
- T033, T034, T035, T036 all parallel (different files)

---

## Parallel Example: Phase 3 Frontend

```bash
# Launch all independent frontend components together:
Task: "Create RAG API client in frontend/src/services/rag.ts"
Task: "Create ChatMessage component in frontend/src/components/ChatMessage.tsx"
Task: "Create SourceList component in frontend/src/components/SourceList.tsx"
```

---

## Implementation Strategy

### MVP First (Phases 1-3 Only)

1. Complete Phase 1: Setup (T001-T003)
2. Complete Phase 2: Foundational (T004-T008)
3. Complete Phase 3: User Story 1+2 (T009-T019)
4. **STOP and VALIDATE**: Test RAG query and source display independently
5. Deploy/demo if ready - this is the MVP!

### Incremental Delivery

1. Complete Setup + Foundational → Foundation ready
2. Add US1+US2 → Test independently → **Deploy/Demo (MVP!)**
3. Add US3 → Test multi-turn → Deploy/Demo
4. Add US4 (optional) → Test note creation → Deploy/Demo
5. Each story adds value without breaking previous stories

### Estimated Effort

| Phase | Tasks | Estimated Hours |
|-------|-------|-----------------|
| Setup | 3 | 0.5 |
| Foundational | 5 | 2 |
| US1+US2 (MVP) | 11 | 4 |
| US3 | 5 | 2 |
| US4 (optional) | 8 | 3 |
| Polish | 6 | 1.5 |
| **Total** | **38** | **13** |

---

## Notes

- [P] tasks = different files, no dependencies
- [Story] label maps a task to a specific user story for traceability
- US1 and US2 are combined in Phase 3 since they're tightly coupled (source display is part of the query response)
- US4 is optional per the spec - can be skipped if time-constrained
- Constitution requires pytest tests for backend features
- Frontend testing is manual verification per Constitution