Implementation Plan: Gemini Vault Chat Agent

Branch: 004-gemini-vault-chat | Date: 2025-11-28 | Spec: spec.md
Input: Feature specification from /specs/004-gemini-vault-chat/spec.md

Summary

Add a Gemini-powered RAG chat agent to the Document-MCP platform. Users can ask natural language questions about their Markdown vault and receive AI-synthesized answers grounded in their documents. The system uses LlamaIndex for document indexing and retrieval, with Gemini as both the LLM and the embedding model. An optional final phase (Phase 3 below) adds constrained note-writing capabilities.
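
In outline, the wiring is small. A minimal sketch of the Gemini configuration, assuming a GOOGLE_API_KEY in the environment; the model names are illustrative, not mandated by the spec:

```python
# Minimal sketch: Gemini as both LLM and embedding model via LlamaIndex.
# Model names are illustrative assumptions.
from llama_index.core import Settings
from llama_index.llms.google_genai import GoogleGenAI
from llama_index.embeddings.google_genai import GoogleGenAIEmbedding

Settings.llm = GoogleGenAI(model="gemini-2.0-flash")
Settings.embed_model = GoogleGenAIEmbedding(model_name="text-embedding-004")
```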

Technical Context

Language/Version: Python 3.11+ (backend), TypeScript (frontend)
Primary Dependencies: FastAPI, LlamaIndex, llama-index-llms-google-genai, llama-index-embeddings-google-genai, React 18+, Tailwind CSS, Shadcn/UI
Storage: Filesystem vault (existing), LlamaIndex persisted vector store (new, under data/llamaindex/)
Testing: pytest (backend), manual verification (frontend)
Target Platform: Hugging Face Spaces (Docker), Linux server
Project Type: Web application (frontend + backend)
Performance Goals: <5 seconds for RAG response (per SC-001)
Constraints: Must not break existing MCP server or ChatGPT widget
Scale/Scope: Hackathon scale; index rebuilds acceptable on restart
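
The new backend dependencies listed above would land in requirements.txt roughly as follows (left unpinned here; the real file should pin versions):

```text
llama-index
llama-index-llms-google-genai
llama-index-embeddings-google-genai
```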

Constitution Check

GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.

Principle | Status | Notes
I. Brownfield Integration | ✅ Pass | Uses existing VaultService; adds new routes/services alongside existing code
II. Test-Backed Development | ✅ Pass | Plan includes pytest tests for the RAG service; frontend is verified manually
III. Incremental Delivery | ✅ Pass | P1 stories (read-only RAG) can ship before P3 (write tools)
IV. Specification-Driven | ✅ Pass | All work traced to spec.md; the note-writing phase is optional per spec
No Magic | ✅ Pass | Direct LlamaIndex usage, no custom abstractions
Single Source of Truth | ✅ Pass | Vault remains the source of truth; the index is a derived view
Error Handling | ✅ Pass | Spec requires FR-011 error messages for AI unavailability

Technology Stack Compliance:

  • Backend: Python 3.11+, FastAPI, Pydantic βœ…
  • Frontend: React 18+, TypeScript, Tailwind, Shadcn/UI βœ…
  • Storage: Filesystem-based (LlamaIndex persisted store) βœ…

Project Structure

Documentation (this feature)

specs/004-gemini-vault-chat/
├── plan.md              # This file
├── research.md          # Phase 0 output
├── data-model.md        # Phase 1 output
├── quickstart.md        # Phase 1 output
├── contracts/           # Phase 1 output
│   └── rag-api.yaml     # OpenAPI spec for RAG endpoints
└── tasks.md             # Phase 2 output (created by /speckit.tasks)

Source Code (repository root)

backend/
├── src/
│   ├── api/
│   │   └── routes/
│   │       └── rag.py           # NEW: RAG chat endpoint
│   ├── models/
│   │   └── rag.py               # NEW: Pydantic models for RAG
│   └── services/
│       └── rag_index.py         # NEW: LlamaIndex service
└── tests/
    └── unit/
        └── test_rag_service.py  # NEW: RAG service tests

frontend/
├── src/
│   ├── components/
│   │   ├── ChatPanel.tsx        # NEW: Chat interface
│   │   ├── ChatMessage.tsx      # NEW: Message component
│   │   └── SourceList.tsx       # NEW: Source references
│   ├── services/
│   │   └── rag.ts               # NEW: RAG API client
│   └── types/
│       └── rag.ts               # NEW: TypeScript types

data/
└── llamaindex/                  # NEW: Persisted vector index

Structure Decision: Web application structure (Option 2). New files added alongside existing code per Constitution Principle I.
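
To make the new backend surface concrete, backend/src/models/rag.py might start from shapes like the following; every field name here is an assumption to be finalized in contracts/rag-api.yaml:

```python
# Plausible request/response shapes for the RAG chat endpoint.
# Field names are assumptions pending the Phase 1 contract.
from pydantic import BaseModel

class ChatTurn(BaseModel):
    role: str                      # "user" or "assistant"
    content: str

class RagChatRequest(BaseModel):
    message: str
    history: list[ChatTurn] = []   # prior turns, used in the multi-turn phase

class SourceRef(BaseModel):
    file_path: str                 # vault-relative path of the source note
    snippet: str                   # excerpt that grounded the answer

class RagChatResponse(BaseModel):
    answer: str
    sources: list[SourceRef]
```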

Complexity Tracking

No violations requiring justification.

Implementation Phases

Phase 1: Core RAG Query (P1 Stories)

Implements User Stories 1-2: Ask questions, view sources.

Backend Tasks:

  1. Add LlamaIndex dependencies to requirements.txt
  2. Create rag_index.py service with get_or_build_index() singleton (see the sketch after this list)
  3. Create rag.py Pydantic models for request/response
  4. Create rag.py route with POST /api/rag/chat endpoint
  5. Add unit tests for RAG service
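
A minimal sketch of task 2, assuming the vault location comes from an environment variable and the index persists under data/llamaindex/; helper names beyond get_or_build_index() are assumptions:

```python
# Sketch of backend/src/services/rag_index.py: build the vault index once,
# persist it, and reload it on later calls. Paths and the VAULT_DIR env
# var are assumptions.
import os
from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

PERSIST_DIR = "data/llamaindex"
VAULT_DIR = os.environ.get("VAULT_DIR", "data/vault")

_index = None  # module-level singleton

def get_or_build_index():
    """Return the vault index, building and persisting it on first use."""
    global _index
    if _index is not None:
        return _index
    if os.path.isdir(PERSIST_DIR):
        # Reuse the index persisted by a previous run.
        storage = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
        _index = load_index_from_storage(storage)
    else:
        # Hackathon scale: read every Markdown file and embed from scratch.
        docs = SimpleDirectoryReader(
            VAULT_DIR, required_exts=[".md"], recursive=True
        ).load_data()
        _index = VectorStoreIndex.from_documents(docs)
        _index.storage_context.persist(persist_dir=PERSIST_DIR)
    return _index
```

The chat route then reduces to roughly get_or_build_index().as_query_engine().query(request.message), with the response's source nodes feeding the sources list.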

Frontend Tasks:

  1. Create ChatPanel.tsx component with message list and composer
  2. Create ChatMessage.tsx for rendering user/assistant messages
  3. Create SourceList.tsx for collapsible source references
  4. Add rag.ts API client service
  5. Integrate ChatPanel into MainApp layout

Phase 2: Multi-Turn Conversation (P2 Story)

Implements User Story 3: Context-aware follow-ups.

Tasks:

  1. Maintain chat history in frontend state
  2. Pass full message history to backend
  3. Update RAG service to use chat history for context (see the sketch below)
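
One way to thread history through LlamaIndex is a condensing chat engine, which rewrites each follow-up into a standalone question before retrieval; the chat_mode choice and the dict-to-ChatMessage mapping below are assumptions:

```python
# Sketch: answer a follow-up question using prior turns for context.
# Assumes `history` arrives as a list of {"role", "content"} dicts.
from llama_index.core.llms import ChatMessage, MessageRole

from services.rag_index import get_or_build_index  # import path is an assumption

def chat_with_history(message: str, history: list[dict]) -> str:
    chat_history = [
        ChatMessage(
            role=MessageRole.USER if t["role"] == "user" else MessageRole.ASSISTANT,
            content=t["content"],
        )
        for t in history
    ]
    # condense_plus_context rewrites the follow-up into a standalone
    # question before retrieval, then answers with vault context.
    engine = get_or_build_index().as_chat_engine(
        chat_mode="condense_plus_context"
    )
    return str(engine.chat(message, chat_history=chat_history))
```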

Phase 3: Agent Note Writing (P3 Story, Optional)

Implements User Story 4: Create/append notes via agent.

Tasks:

  1. Create constrained write helpers (create_note, append_to_note); see the sketch after this list
  2. Register as LlamaIndex agent tools
  3. Add notes_written to response model
  4. Show created notes badge in UI
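
A sketch of the constrained write path for this phase; the path guard and vault location are assumptions, and real writes would delegate to the existing VaultService:

```python
# Sketch: constrained note-writing helpers registered as LlamaIndex tools.
# The vault location and path guard are assumptions; production writes
# should go through the existing VaultService.
from pathlib import Path
from llama_index.core.tools import FunctionTool

VAULT_DIR = Path("data/vault")  # assumed vault location

def _safe_path(relative_path: str) -> Path:
    """Resolve a note path, refusing anything outside the vault."""
    target = (VAULT_DIR / relative_path).resolve()
    if not target.is_relative_to(VAULT_DIR.resolve()):
        raise ValueError("path escapes the vault")
    return target

def create_note(relative_path: str, content: str) -> str:
    """Create a new Markdown note; fails if it already exists."""
    target = _safe_path(relative_path)
    if target.exists():
        raise ValueError("note already exists; use append_to_note")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return f"created {relative_path}"

def append_to_note(relative_path: str, content: str) -> str:
    """Append content to an existing note."""
    target = _safe_path(relative_path)
    if not target.exists():
        raise ValueError("note does not exist; use create_note")
    with target.open("a", encoding="utf-8") as f:
        f.write("\n" + content)
    return f"appended to {relative_path}"

# Hand these to a LlamaIndex agent (the exact agent class varies by
# llama-index version, e.g. ReActAgent or FunctionAgent).
tools = [
    FunctionTool.from_defaults(fn=create_note),
    FunctionTool.from_defaults(fn=append_to_note),
]
```

Recording each successful write (for example, into a per-request list) would populate the notes_written response field that drives the created-notes badge in the UI.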