Kareman committed
Commit b5739f3 · 0 Parent(s)
Files changed (10)
  1. .gitignore +48 -0
  2. Dockerfile +34 -0
  3. README.md +141 -0
  4. app/langgraph_flow.py +60 -0
  5. app/main.py +28 -0
  6. app/models.py +19 -0
  7. app/recommender.py +82 -0
  8. app/utils.py +49 -0
  9. prepare_data.py +108 -0
  10. requirements.txt +13 -0
.gitignore ADDED
@@ -0,0 +1,48 @@
+ # --- Python ---
+ __pycache__/
+ *.py[cod]
+ *.pyo
+ *.pyd
+ *.so
+ *.egg
+ *.egg-info/
+ dist/
+ build/
+ .eggs/
+
+ # --- Virtual environments ---
+ .venv/
+ venv/
+ env/
+ ENV/
+ *.env
+ .env.*
+
+ # --- Jupyter / notebooks ---
+ .ipynb_checkpoints
+ *.ipynb
+
+ # --- OS / Editor files ---
+ .DS_Store
+ Thumbs.db
+ .idea/
+ .vscode/
+
+ # --- Logs / Caches ---
+ *.log
+ *.out
+ *.err
+ *.sqlite3
+ .cache/
+ .mypy_cache/
+ .pytest_cache/
+ coverage/
+ htmlcov/
+
+ # --- FAISS / Embedding intermediate dumps ---
+ *.npy
+
+ # --- Project specific ---
+ # Ignore everything inside data/ and faiss_index/ by default; files that
+ # should stay in git (e.g., the prebuilt index) must be force-added with `git add -f`
+ data/*
+ faiss_index/*
Dockerfile ADDED
@@ -0,0 +1,34 @@
+ # ---- Base ----
+ FROM python:3.10-slim
+
+ # Set workdir
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     git \
+     build-essential \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy project files (including data/ and faiss_index/)
+ COPY . /app
+
+ # Upgrade pip and install dependencies
+ RUN pip install --upgrade pip
+ RUN pip install -r requirements.txt
+
+ # ---- Pre-download MiniLM embeddings at build time ----
+ # The model is stored in the default Hugging Face cache (~/.cache/huggingface)
+ RUN python -c "from langchain_huggingface import HuggingFaceEmbeddings; HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')"
+
+ # ---- Copy the FAISS index to /tmp at runtime ----
+ # Done in an entrypoint script, since /tmp is the only writable location in Spaces
+ COPY entrypoint.sh /app/entrypoint.sh
+ RUN chmod +x /app/entrypoint.sh
+
+ # Expose port
+ EXPOSE 8000
+
+ # Run entrypoint
+ CMD ["/app/entrypoint.sh"]
README.md ADDED
@@ -0,0 +1,141 @@
+ # 🎬 Movie Recommender System (FastAPI + LangGraph + FAISS)
+
+ This project is an AI-powered **movie recommender system**.
+ It uses **FAISS vector search**, **local embeddings**, and **LLMs (via OpenRouter)** to recommend movies in **any language**.
+
+ The pipeline:
+ 1. Detects the language of the user query.
+ 2. Translates the query into English.
+ 3. Retrieves similar movies using embeddings + FAISS.
+ 4. Generates natural-language explanations with an LLM.
+ 5. Translates the explanations back into the user's language.
+
+ ---
+
+ ## ✨ Features
+ - Multilingual support (query in any language 🌍).
+ - Fast similarity search with **FAISS**.
+ - Local embeddings with [MiniLM](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).
+ - Explanations powered by **OpenRouter LLMs**.
+ - Modular pipeline built with **LangGraph**.
+
+ ---
+
+ ## 🛠️ Tech Stack
+ - **Backend**: FastAPI
+ - **Vector DB**: FAISS
+ - **Embeddings**: `sentence-transformers/all-MiniLM-L6-v2` (local)
+ - **Orchestration**: LangChain + LangGraph
+ - **LLM**: OpenRouter (`meta-llama/llama-4-scout:free` by default)
+ - **Deployment**: Docker / Hugging Face Spaces
+
+ ---
+
+ ## 📂 Project Structure
+ ```
+ .
+ ├── app/
+ │   ├── main.py            # FastAPI entry point
+ │   ├── models.py          # Pydantic request/response models
+ │   ├── recommender.py     # Core recommender logic
+ │   ├── langgraph_flow.py  # LangGraph workflow
+ │   └── utils.py           # Helper functions
+ ├── data/                  # Movies dataset
+ ├── faiss_index/           # Prebuilt FAISS index + metadata
+ ├── prepare_data.py        # Script to build the FAISS index
+ ├── requirements.txt
+ ├── Dockerfile
+ ├── .env                   # API keys (not committed)
+ ├── .gitignore
+ └── README.md
+ ```
+
+ ---
+
+ ## 🚀 Getting Started
+
+ ### 1. Clone & Setup
+ ```bash
+ git clone https://github.com/your-username/movie-recommender.git
+ cd movie-recommender
+
+ # Create virtual environment
+ python -m venv .venv
+ source .venv/bin/activate
+
+ # Install dependencies
+ pip install -r requirements.txt
+ ```
+
+ ### 2. Environment Variables
+ Create a `.env` file in the project root:
+ ```ini
+ OPENROUTER=your_openrouter_api_key
+ ```
+
+ ### 3. Prepare FAISS Index
+ If not already included:
+ ```bash
+ python prepare_data.py
+ ```
+
+ This builds:
+ - `faiss_index/index.faiss`
+ - `faiss_index/index.pkl`
+
+ ### 4. Run FastAPI App
+ ```bash
+ uvicorn app.main:app --reload
+ ```
+
+ The backend starts at:
+ 👉 http://127.0.0.1:8000
+
+ Interactive API docs at:
+ 👉 http://127.0.0.1:8000/docs
+
+ ---
+
+ ## 📌 Example Usage
+
+ ### Request
+ ```bash
+ curl -X POST http://127.0.0.1:8000/recommend -H "Content-Type: application/json" -d '{"query": "لطفا یک فیلم فانتزی هیجان انگیز شاد بهم معرفی کن", "k": 5}'
+ ```
+
+ The query above is Persian for "please recommend a fun, exciting fantasy movie"; explanations come back in the same language.
+
+ ### Response
+ ```json
+ {
+   "recommendations": [
+     {
+       "title": "The Incredibles",
+       "genres": "Action|Animation|Adventure",
+       "overview": "A family of superheroes...",
+       "explanation": "این فیلم یک ماجراجویی شاد و هیجان‌انگیز است که با درخواست شما مطابقت دارد."
+     },
+     ...
+   ]
+ }
+ ```
+
+ ---
+
+ ## 🐳 Deployment with Docker
+ Build and run locally:
+ ```bash
+ docker build -t movie-recommender .
+ docker run -p 8000:8000 movie-recommender
+ ```
+
+ For Hugging Face Spaces:
+ - Only `/tmp` is writable at runtime.
+ - Pre-download embeddings + FAISS index during build.
+
+ ---
+
+ ## 🧩 Next Steps
+ - Add **user profiles** for personalized recommendations.
+ - Support **hybrid search** (metadata + embeddings).
+ - Add **Next.js frontend** for a full-stack app.
+ - Deploy to **Hugging Face Spaces**.
+
+ ---
+
+ ## 📜 License
+ MIT License. Free to use & modify.
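The README's example request can also be issued from Python rather than curl; a quick sketch assuming the `requests` package and a locally running server:

```python
import requests

resp = requests.post(
    "http://127.0.0.1:8000/recommend",
    json={"query": "a feel-good fantasy adventure", "k": 5},
    timeout=120,  # LLM explanation + translation can be slow
)
resp.raise_for_status()
for rec in resp.json()["recommendations"]:
    print(f'{rec["title"]}: {rec["explanation"]}')
```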
app/langgraph_flow.py ADDED
@@ -0,0 +1,60 @@
+ from typing import TypedDict, List, Dict, Optional
+
+ from langchain.schema import Document
+ from langgraph.graph import StateGraph, END
+
+
+ class State(TypedDict):
+     query: str                        # user query (any language)
+     user_lang: str                    # detected language code (e.g., "es")
+     k: int                            # number of recommendations to return
+     translated_query: Optional[str]   # query translated into English
+     docs: Optional[List[Document]]    # retrieved candidate documents
+     recommendations: Optional[List[Dict]]
+
+
+ def build_graph(recommender):
+     graph = StateGraph(State)
+
+     # Stage 1: Detect the query language and translate it into English
+     def translate_in(state: State):
+         user_lang = recommender.detect_language(state["query"])
+         translated_query = state["query"]
+         if user_lang != "en":
+             translated_query = recommender.translate(state["query"], "en")
+         return {"user_lang": user_lang, "translated_query": translated_query}
+
+     # Stage 2: Retrieval (fetch 2k candidates so the top k survive later slicing)
+     def retrieve(state: State):
+         docs = recommender.search(state["translated_query"], k=state["k"] * 2)
+         return {"docs": docs}
+
+     # Stage 3: Explanation (in English)
+     def explain(state: State):
+         recs = recommender.explain(
+             state["translated_query"], state["docs"][: state["k"]], user_lang="en"
+         )
+         return {"recommendations": recs}
+
+     # Stage 4: Translate explanations back into the user's language
+     def translate_out(state: State):
+         if state["user_lang"] != "en":
+             for r in state["recommendations"]:
+                 r["explanation"] = recommender.translate(r["explanation"], state["user_lang"])
+         return {"recommendations": state["recommendations"]}
+
+     # Build graph
+     graph.add_node("translate_in", translate_in)
+     graph.add_node("retrieve", retrieve)
+     graph.add_node("explain", explain)
+     graph.add_node("translate_out", translate_out)
+
+     graph.set_entry_point("translate_in")
+     graph.add_edge("translate_in", "retrieve")
+     graph.add_edge("retrieve", "explain")
+     graph.add_edge("explain", "translate_out")
+     graph.add_edge("translate_out", END)
+
+     return graph.compile()
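Outside FastAPI, the compiled graph can be driven directly; a minimal sketch, assuming a built index on disk and the `OPENROUTER` key in `.env`:

```python
from app.recommender import Recommender
from app.langgraph_flow import build_graph

# Only "query" and "k" need to be supplied; the nodes fill in
# user_lang, translated_query, docs, and recommendations.
graph = build_graph(Recommender())
result = graph.invoke({"query": "una película de fantasía divertida", "k": 3})
for rec in result["recommendations"]:
    print(rec["title"], "->", rec["explanation"])
```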
app/main.py ADDED
@@ -0,0 +1,28 @@
+ from fastapi import FastAPI
+ from fastapi.middleware.cors import CORSMiddleware
+
+ from app.models import RecommendRequest, RecommendResponse
+ from app.recommender import Recommender
+ from app.langgraph_flow import build_graph
+
+ app = FastAPI(title="Movie Recommender")
+
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],  # later restrict to the frontend domain
+     allow_credentials=True,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+ recommender = Recommender()
+ graph = build_graph(recommender)
+
+
+ @app.post("/recommend", response_model=RecommendResponse)
+ async def recommend(req: RecommendRequest):
+     state = {"query": req.query, "k": req.k}
+     result = graph.invoke(state)
+     return {"recommendations": result["recommendations"]}
+
+
+ @app.get("/health")
+ async def health():
+     return {"status": "ok"}
app/models.py ADDED
@@ -0,0 +1,19 @@
+ from typing import List, Optional
+
+ from pydantic import BaseModel
+
+
+ class RecommendRequest(BaseModel):
+     query: str
+     k: Optional[int] = 5
+
+
+ class Recommendation(BaseModel):
+     title: str
+     genres: str
+     overview: str
+     # Optional metadata fields default to None so a response still validates
+     # when a movie record is missing them (under Pydantic v2 an Optional
+     # field without a default is required).
+     director: Optional[str] = None
+     cast: Optional[str] = None
+     release_date: Optional[str] = None
+     vote_average: Optional[float] = None
+     explanation: str
+
+
+ class RecommendResponse(BaseModel):
+     recommendations: List[Recommendation]
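A small check that a payload round-trips through the schema (assumes Pydantic v2, which recent FastAPI releases use); the optional metadata fields can be omitted:

```python
from app.models import RecommendResponse

payload = {"recommendations": [{
    "title": "The Incredibles",
    "genres": "Action|Animation|Adventure",
    "overview": "A family of superheroes...",
    "explanation": "A joyful, exciting adventure.",
}]}
resp = RecommendResponse.model_validate(payload)
print(resp.recommendations[0].title)
```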
app/recommender.py ADDED
@@ -0,0 +1,82 @@
+ import os
+
+ from dotenv import load_dotenv
+ from langchain_community.vectorstores import FAISS
+ from langchain_huggingface import HuggingFaceEmbeddings
+ from langchain_openai import ChatOpenAI
+ from langdetect import detect
+
+ load_dotenv()  # loads .env into os.environ
+
+
+ class Recommender:
+     def __init__(self, index_dir="faiss_index"):
+         # ✅ Embeddings (English only)
+         self.embeddings = HuggingFaceEmbeddings(
+             model_name="sentence-transformers/all-MiniLM-L6-v2"
+         )
+         self.db = FAISS.load_local(
+             index_dir, self.embeddings, allow_dangerous_deserialization=True
+         )
+
+         # ✅ OpenRouter LLMs (one for explanations, one for translation)
+         self.llmExplanation = ChatOpenAI(
+             openai_api_key=os.environ["OPENROUTER"],
+             openai_api_base="https://openrouter.ai/api/v1",
+             model="meta-llama/llama-4-scout:free",
+             temperature=0,
+             max_tokens=512,
+         )
+         self.llmTranslation = ChatOpenAI(
+             openai_api_key=os.environ["OPENROUTER"],
+             openai_api_base="https://openrouter.ai/api/v1",
+             model="meta-llama/llama-4-scout:free",  # swap in a dedicated translation model here if needed
+             temperature=0,
+             max_tokens=512,
+         )
+
+     # 🔹 Stage 1a: Language detection
+     def detect_language(self, text: str) -> str:
+         return detect(text)
+
+     # 🔹 Stage 1b + 4: Translation (to/from English)
+     def translate(self, text: str, target_lang: str = "en") -> str:
+         prompt = f"Translate this text into {target_lang}: {text}"
+         return self.llmTranslation.invoke(prompt).content
+
+     # 🔹 Stage 2: Retrieval
+     def search(self, query: str, k: int = 10):
+         return self.db.similarity_search(query, k=k)
+
+     # 🔹 Stage 3: Explanation (always in English; user_lang is kept for
+     # interface symmetry, translation happens later in the graph)
+     def explain(self, query: str, docs, user_lang="en"):
+         results = []
+         for d in docs:
+             prompt = (
+                 f"User request: {query}\n"
+                 f"Candidate movie: {d.metadata['title']} "
+                 f"({d.metadata.get('genres')}).\n"
+                 f"Overview: {d.metadata.get('overview')}\n\n"
+                 "Explain in one sentence why this movie could be a good recommendation "
+                 "for the user’s request. Focus only on positive connections."
+             )
+             response = self.llmExplanation.invoke(prompt).content
+
+             results.append({
+                 "title": d.metadata["title"],
+                 "genres": d.metadata["genres"],
+                 "overview": d.metadata["overview"],
+                 "director": d.metadata.get("director"),
+                 "cast": d.metadata.get("cast"),
+                 "release_date": d.metadata.get("release_date"),
+                 "vote_average": d.metadata.get("vote_average"),
+                 "explanation": response,  # always English at this stage
+             })
+         return results
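The same methods can also be used piecemeal, without the graph; a rough sketch of the four stages wired by hand:

```python
from app.recommender import Recommender

rec = Recommender()
query = "فیلم فانتزی شاد"                 # a Persian query, as in the README
lang = rec.detect_language(query)          # e.g. "fa"
query_en = rec.translate(query, "en")      # Stage 1: into English
docs = rec.search(query_en, k=10)          # Stage 2: FAISS retrieval
results = rec.explain(query_en, docs[:5])  # Stage 3: English explanations
for r in results:                          # Stage 4: back into the user's language
    r["explanation"] = rec.translate(r["explanation"], lang)
```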
app/utils.py ADDED
@@ -0,0 +1,49 @@
+ import os
+
+ import torch
+ from dotenv import load_dotenv
+ from langchain_core.embeddings import Embeddings
+ from transformers import AutoModel, AutoTokenizer
+
+ load_dotenv()  # ✅ make sure .env is read
+
+
+ class GemmaEmbeddings:
+     def __init__(self, model_name="google/embeddinggemma-300m", device=None):
+         self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
+
+         hf_token = os.environ.get("HF_TOKEN")
+         if not hf_token:
+             raise ValueError("❌ Hugging Face token not found. Please set HF_TOKEN in .env")
+
+         # ✅ Pass the token when loading the gated model
+         self.tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token)
+         self.model = AutoModel.from_pretrained(model_name, token=hf_token).to(self.device)
+         self.model.eval()
+
+     def embed(self, texts):
+         if isinstance(texts, str):
+             texts = [texts]
+
+         encodings = self.tokenizer(
+             texts, padding=True, truncation=True, return_tensors="pt"
+         ).to(self.device)
+
+         with torch.no_grad():
+             model_output = self.model(**encodings)
+
+         # Mean-pool over real tokens only, masking out padding positions
+         mask = encodings["attention_mask"].unsqueeze(-1)
+         summed = (model_output.last_hidden_state * mask).sum(dim=1)
+         counts = mask.sum(dim=1).clamp(min=1)
+         embeddings = (summed / counts).cpu().numpy()
+         return embeddings.tolist()
+
+
+ class GemmaLangChainEmbeddings(Embeddings):
+     def __init__(self, model_name="google/embeddinggemma-300m"):
+         self.gemma = GemmaEmbeddings(model_name=model_name)
+
+     def embed_query(self, text: str):
+         return self.gemma.embed(text)[0]
+
+     def embed_documents(self, texts: list[str]):
+         return self.gemma.embed(texts)
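These Gemma classes are not used by the active pipeline (prepare_data.py and app/recommender.py use MiniLM), but swapping them in could look roughly like the sketch below. Note that `google/embeddinggemma-300m` is a gated model, so `HF_TOKEN` must be set in `.env`:

```python
from langchain_community.vectorstores import FAISS
from app.utils import GemmaLangChainEmbeddings

# Build a tiny in-memory index with Gemma embeddings instead of MiniLM.
embeddings = GemmaLangChainEmbeddings()
db = FAISS.from_texts(["A family of superheroes saves the city."], embeddings)
print(db.similarity_search("superhero adventure", k=1))
```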
prepare_data.py ADDED
@@ -0,0 +1,108 @@
+ '''
+ import pandas as pd
+ import numpy as np
+ import faiss, pickle, os
+ from app.utils import GemmaEmbeddings
+
+ def build_index(
+     csv_path="data/movies.csv",
+     out_dir="faiss_index",
+     batch_size=32,
+     checkpoint_size=1000
+ ):
+     df = pd.read_csv(csv_path)
+     texts = df["overview"].fillna("").tolist()
+     total = len(texts)
+
+     os.makedirs(out_dir, exist_ok=True)
+     embedder = GemmaEmbeddings()
+
+     embeddings = []
+     start_idx = 0
+
+     # 🔹 Check for existing partial progress
+     checkpoint_file = f"{out_dir}/progress.pkl"
+     if os.path.exists(checkpoint_file):
+         with open(checkpoint_file, "rb") as f:
+             saved = pickle.load(f)
+         embeddings = saved["embeddings"]
+         start_idx = saved["next_idx"]
+         print(f"🔄 Resuming from index {start_idx}")
+
+     # 🔹 Process in batches
+     for i in range(start_idx, total, batch_size):
+         batch = texts[i:i+batch_size]
+         vectors = embedder.embed(batch)
+         embeddings.extend(vectors)
+         print(f"✅ Processed {i+len(batch)} / {total}")
+
+         # Save a checkpoint every 10 batches (and at the end)
+         if (i + batch_size) % (10*batch_size) == 0 or (i + batch_size) >= total:
+             with open(checkpoint_file, "wb") as f:
+                 pickle.dump({
+                     "embeddings": embeddings,
+                     "next_idx": i + batch_size
+                 }, f)
+             print(f"💾 Saved checkpoint at {i+batch_size}")
+
+     # 🔹 Build the FAISS index at the end
+     embeddings = np.array(embeddings).astype("float32")
+     dim = embeddings.shape[1]
+     index = faiss.IndexFlatL2(dim)
+     index.add(embeddings)
+
+     faiss.write_index(index, f"{out_dir}/movies_index.faiss")
+     with open(f"{out_dir}/movies.pkl", "wb") as f:
+         pickle.dump(df.to_dict(orient="records"), f)
+
+     # Remove the checkpoint after success
+     if os.path.exists(checkpoint_file):
+         os.remove(checkpoint_file)
+     print("🎉 Index built successfully!")
+
+ if __name__ == "__main__":
+     build_index()
+ '''
+
+ import os
+
+ import pandas as pd
+ from langchain_community.vectorstores import FAISS
+ from langchain_huggingface import HuggingFaceEmbeddings
+
+
+ def build_faiss(csv_path="data/movies.csv", out_dir="faiss_index"):
+     df = pd.read_csv(csv_path).fillna("")
+
+     texts, metadatas = [], []
+     for _, row in df.iterrows():
+         text = (
+             f"Title: {row['title']}.\n"
+             f"Overview: {row['overview']}.\n"
+             f"Genres: {row['genres']}.\n"
+             f"Director: {row['director']}.\n"
+             f"Cast: {row['cast']}."
+         )
+         texts.append(text)
+         metadatas.append({
+             "id": row["id"],
+             "title": row["title"],
+             "genres": row["genres"],
+             "overview": row["overview"],
+             "director": row["director"],
+             "cast": row["cast"],
+             "release_date": row["release_date"],
+             "vote_average": row["vote_average"],
+             "popularity": row["popularity"]
+         })
+
+     # ✅ Use local MiniLM embeddings (the same model the app loads at query time)
+     embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
+
+     db = FAISS.from_texts(texts, embeddings, metadatas=metadatas)
+     os.makedirs(out_dir, exist_ok=True)
+     db.save_local(out_dir)
+     print(f"✅ Saved FAISS index with {len(df)} movies to {out_dir}")
+
+
+ if __name__ == "__main__":
+     build_faiss("data/movies.csv")
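A quick smoke test for `build_faiss`: the one-row CSV below carries every column the script reads (the sample values are illustrative, and the file and output paths are arbitrary):

```python
import pandas as pd
from prepare_data import build_faiss

# One-row dataset with all the columns build_faiss expects.
sample = pd.DataFrame([{
    "id": 1,
    "title": "The Incredibles",
    "genres": "Action|Animation|Adventure",
    "overview": "A family of superheroes...",
    "director": "Brad Bird",
    "cast": "Craig T. Nelson|Holly Hunter",
    "release_date": "2004-11-05",
    "vote_average": 7.7,
    "popularity": 50.0,
}])
sample.to_csv("movies_sample.csv", index=False)
build_faiss("movies_sample.csv", out_dir="faiss_index_sample")
```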
requirements.txt ADDED
@@ -0,0 +1,13 @@
+ fastapi==0.117.1
+ uvicorn==0.37.0
+ pandas==2.3.2
+ faiss-cpu==1.7.4
+ langchain==0.3.27
+ langchain-community==0.3.30
+ langchain-openai==0.3.33
+ sentence-transformers==5.1.1
+ python-dotenv==1.1.1
+ numpy==1.26.4
+ langchain-huggingface==0.3.1
+ langgraph==0.6.7
+ langdetect==1.0.9