Implementation Plan: Gemini Vault Chat Agent
Branch: 004-gemini-vault-chat | Date: 2025-11-28 | Spec: spec.md
Input: Feature specification from /specs/004-gemini-vault-chat/spec.md
Summary
Add a Gemini-powered RAG chat agent to the Document-MCP platform. Users can ask natural-language questions about their Markdown vault and receive AI-synthesized answers grounded in their documents. The system uses LlamaIndex for document indexing and retrieval, with Gemini as both the LLM and the embedding model. An optional final phase adds constrained note-writing capabilities.
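A minimal sketch of the intended wiring, assuming the two Google GenAI integration packages listed under Technical Context; the model names here are illustrative placeholders, not decisions:

```python
# Sketch: point LlamaIndex at Gemini for both generation and embeddings.
# Model names are illustrative placeholders, not final choices.
from llama_index.core import Settings
from llama_index.llms.google_genai import GoogleGenAI
from llama_index.embeddings.google_genai import GoogleGenAIEmbedding

Settings.llm = GoogleGenAI(model="gemini-2.0-flash")
Settings.embed_model = GoogleGenAIEmbedding(model_name="text-embedding-004")
```

Setting these globally on `Settings` means every index and query engine built later picks up Gemini without per-call configuration.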
Technical Context
Language/Version: Python 3.11+ (backend), TypeScript (frontend)
Primary Dependencies: FastAPI, LlamaIndex, llama-index-llms-google-genai, llama-index-embeddings-google-genai, React 18+, Tailwind CSS, Shadcn/UI
Storage: Filesystem vault (existing), LlamaIndex persisted vector store (new, under data/llamaindex/)
Testing: pytest (backend), manual verification (frontend)
Target Platform: Hugging Face Spaces (Docker), Linux server
Project Type: Web application (frontend + backend)
Performance Goals: <5 seconds for RAG response (per SC-001)
Constraints: Must not break existing MCP server or ChatGPT widget
Scale/Scope: Hackathon scale; index rebuilds are acceptable on restart
Constitution Check
GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.
| Principle | Status | Notes |
|---|---|---|
| I. Brownfield Integration | ✅ Pass | Uses existing VaultService, adds new routes/services alongside existing code |
| II. Test-Backed Development | ✅ Pass | Plan includes pytest tests for RAG service; frontend is manual verification |
| III. Incremental Delivery | ✅ Pass | P1 stories (read-only RAG) can ship before P3 (write tools) |
| IV. Specification-Driven | ✅ Pass | All work traced to spec.md; Phase 2 is optional per spec |
| No Magic | ✅ Pass | Direct LlamaIndex usage, no custom abstractions |
| Single Source of Truth | ✅ Pass | Vault remains source of truth; index is derived view |
| Error Handling | ✅ Pass | FR-011 requires clear error messages when the AI service is unavailable |
Technology Stack Compliance:
- Backend: Python 3.11+, FastAPI, Pydantic ✅
- Frontend: React 18+, TypeScript, Tailwind, Shadcn/UI ✅
- Storage: Filesystem-based (LlamaIndex persisted store) ✅
Project Structure
Documentation (this feature)
```
specs/004-gemini-vault-chat/
├── plan.md              # This file
├── research.md          # Phase 0 output
├── data-model.md        # Phase 1 output
├── quickstart.md        # Phase 1 output
├── contracts/           # Phase 1 output
│   └── rag-api.yaml     # OpenAPI spec for RAG endpoints
└── tasks.md             # Phase 2 output (created by /speckit.tasks)
```
Source Code (repository root)
```
backend/
├── src/
│   ├── api/
│   │   └── routes/
│   │       └── rag.py               # NEW: RAG chat endpoint
│   ├── models/
│   │   └── rag.py                   # NEW: Pydantic models for RAG
│   └── services/
│       └── rag_index.py             # NEW: LlamaIndex service
└── tests/
    └── unit/
        └── test_rag_service.py      # NEW: RAG service tests

frontend/
└── src/
    ├── components/
    │   ├── ChatPanel.tsx            # NEW: Chat interface
    │   ├── ChatMessage.tsx          # NEW: Message component
    │   └── SourceList.tsx           # NEW: Source references
    ├── services/
    │   └── rag.ts                   # NEW: RAG API client
    └── types/
        └── rag.ts                   # NEW: TypeScript types

data/
└── llamaindex/                      # NEW: Persisted vector index
```
Structure Decision: Web application structure (Option 2). New files added alongside existing code per Constitution Principle I.
Complexity Tracking
No violations requiring justification.
Implementation Phases
Phase 1: Core RAG Query (P1 Stories)
Implements User Stories 1-2: Ask questions, view sources.
Backend Tasks:
- Add LlamaIndex dependencies to `requirements.txt`
- Create `rag_index.py` service with `get_or_build_index()` singleton (see the sketch after this list)
- Create `rag.py` Pydantic models for request/response
- Create `rag.py` route with `POST /api/rag/chat` endpoint
- Add unit tests for RAG service
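A minimal sketch of how the service and route could fit together; the vault location, response field names, and error handling are assumptions, not decisions from the spec:

```python
# Sketch only: backend/src/services/rag_index.py plus the route that uses it.
# Vault location and response field names are assumptions.
from pathlib import Path

from fastapi import APIRouter
from pydantic import BaseModel
from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

PERSIST_DIR = Path("data/llamaindex")
VAULT_DIR = Path("data/vault")  # hypothetical vault location

_index = None  # module-level singleton; rebuilt on restart per the plan's scope


def get_or_build_index():
    """Load the persisted index if present, otherwise build it from the vault."""
    global _index
    if _index is None:
        if PERSIST_DIR.exists():
            storage = StorageContext.from_defaults(persist_dir=str(PERSIST_DIR))
            _index = load_index_from_storage(storage)
        else:
            docs = SimpleDirectoryReader(
                str(VAULT_DIR), required_exts=[".md"]
            ).load_data()
            _index = VectorStoreIndex.from_documents(docs)
            _index.storage_context.persist(persist_dir=str(PERSIST_DIR))
    return _index


# backend/src/api/routes/rag.py -- request/response models and the endpoint.
class ChatRequest(BaseModel):
    question: str


class ChatResponse(BaseModel):
    answer: str
    sources: list[str]


router = APIRouter(prefix="/api/rag")


@router.post("/chat", response_model=ChatResponse)
def chat(req: ChatRequest) -> ChatResponse:
    result = get_or_build_index().as_query_engine().query(req.question)
    sources = [n.node.metadata.get("file_name", "") for n in result.source_nodes]
    return ChatResponse(answer=str(result), sources=sources)
```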
Frontend Tasks:
- Create `ChatPanel.tsx` component with message list and composer
- Create `ChatMessage.tsx` for rendering user/assistant messages
- Create `SourceList.tsx` for collapsible source references
- Add `rag.ts` API client service
- Integrate ChatPanel into the MainApp layout
Phase 2: Multi-Turn Conversation (P2 Story)
Implements User Story 3: Context-aware follow-ups.
Tasks:
- Maintain chat history in frontend state
- Pass full message history to backend
- Update RAG service to use chat history for context (sketched below)
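One way to consume the history, sketched under the assumption that messages arrive as role/content pairs and that `get_or_build_index()` from the earlier sketch is available, is LlamaIndex's condense-plus-context chat mode, which rewrites a follow-up into a standalone question before retrieval:

```python
# Sketch: condense-plus-context chat so retrieval sees a standalone question.
# The role/content message shape is an assumption about the request payload.
from llama_index.core.llms import ChatMessage, MessageRole


def chat_with_history(question: str, history: list[dict]) -> str:
    chat_history = [
        ChatMessage(role=MessageRole(m["role"]), content=m["content"])
        for m in history  # e.g. [{"role": "user", "content": "..."}]
    ]
    engine = get_or_build_index().as_chat_engine(
        chat_mode="condense_plus_context"
    )
    return str(engine.chat(question, chat_history=chat_history))
```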
Phase 3: Agent Note Writing (P3 Story, Optional)
Implements User Story 4: Create/append notes via agent.
Tasks:
- Create constrained write helpers (`create_note`, `append_to_note`), sketched below
- Register as LlamaIndex agent tools
- Add `notes_written` to response model
- Show created notes badge in UI
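A sketch of the constrained helpers, assuming the vault lives under `data/vault/` (hypothetical path) and that "constrained" means writes cannot escape the vault root:

```python
# Sketch: write helpers that refuse any path outside the vault root, wrapped
# as LlamaIndex tools. The vault path is a hypothetical placeholder.
from pathlib import Path
from llama_index.core.tools import FunctionTool

VAULT_DIR = Path("data/vault")  # hypothetical vault location


def _safe_path(name: str) -> Path:
    """Resolve a note name and reject anything that escapes the vault."""
    path = (VAULT_DIR / name).resolve()
    if not path.is_relative_to(VAULT_DIR.resolve()):
        raise ValueError("Writes are restricted to the vault")
    return path


def create_note(name: str, content: str) -> str:
    """Create a new Markdown note inside the vault."""
    path = _safe_path(name)
    if path.exists():
        raise ValueError(f"{name} already exists")
    path.write_text(content, encoding="utf-8")
    return f"Created {name}"


def append_to_note(name: str, content: str) -> str:
    """Append content to an existing vault note."""
    path = _safe_path(name)
    with path.open("a", encoding="utf-8") as f:
        f.write("\n" + content)
    return f"Appended to {name}"


# Docstrings double as tool descriptions once registered with the agent.
tools = [
    FunctionTool.from_defaults(fn=create_note),
    FunctionTool.from_defaults(fn=append_to_note),
]
```

Keeping the path check in one helper means both tools stay constrained even if more write operations are added later.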