---
title: SBERT + FAISS Semantic Search
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.45.0
app_file: app.py
pinned: false
---

# SBERT + FAISS Semantic Search + Evaluation Metrics

This Hugging Face Space hosts a **semantic search system** built with:

- [Sentence-BERT (SBERT)](https://www.sbert.net/) for embeddings
- [FAISS](https://faiss.ai/) for fast vector search
- [MS MARCO v1.1 dataset](https://microsoft.github.io/msmarco/) (10,000-passage subset)
- [Gradio](https://gradio.app/) for the interactive interface

---

## 🔹 Features

- Enter a **query** to retrieve the **Top-10 most similar passages**.
- Computes **true IR metrics** when the query matches one in the MS MARCO validation set:
  - Precision@10
  - Recall@10
  - F1-score
  - Mean Reciprocal Rank (MRR)
  - Normalized Discounted Cumulative Gain (nDCG@10)

---

## 🔹 How to Use

1. Type a query into the input box.
2. Press **Submit**.
3. View:
   - **Top-10 retrieved passages** with similarity scores
   - **Evaluation metrics**, if the query exists in the validation set

---

## 🔹 Tech Stack

- **Embeddings:** `sentence-transformers/all-mpnet-base-v2`
- **Indexing:** FAISS (L2 distance)
- **Dataset:** MS MARCO v1.1 (first 10,000 passages)
- **Interface:** Gradio

---

## 🔹 Citation

If you use this system in research, please cite:

- [Sentence-BERT](https://arxiv.org/abs/1908.10084)
- [MS MARCO](https://microsoft.github.io/msmarco/)

---

## 🔹 Author

Built for a research project on **user-centered evaluation of semantic search systems**.
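
---

## 🔹 Example: Computing the Metrics

The metrics listed under Features can be sketched in plain Python as follows. This is a minimal illustration, not the code in `app.py`: it assumes binary relevance (a passage is either relevant or not), divides Precision@k by `k` rather than by the number of results returned, and uses hypothetical passage IDs (`p0` … `p9`) and function names chosen for this example. For a single query, MRR reduces to the reciprocal rank of the first relevant passage.

```python
import math


def precision_at_k(retrieved, relevant, k=10):
    """Fraction of the top-k slots filled by relevant passages."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k=10):
    """Fraction of all relevant passages that appear in the top-k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def reciprocal_rank(retrieved, relevant, k=10):
    """1 / rank of the first relevant passage in the top-k, else 0.0."""
    for rank, doc_id in enumerate(retrieved[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(retrieved, relevant, k=10):
    """nDCG with binary gains: DCG of the ranking / DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranking for one query: relevant passages sit at ranks 2 and 5.
retrieved = ["p3", "p7", "p1", "p9", "p4", "p8", "p2", "p6", "p5", "p0"]
relevant = {"p7", "p4"}

p = precision_at_k(retrieved, relevant)
r = recall_at_k(retrieved, relevant)
metrics = {
    "precision@10": p,
    "recall@10": r,
    "f1": f1_score(p, r),
    "rr": reciprocal_rank(retrieved, relevant),
    "ndcg@10": ndcg_at_k(retrieved, relevant),
}
print(metrics)
```

Because MS MARCO queries typically have very few passages marked relevant, Precision@10 under this definition stays low even for a good ranking; the rank-sensitive metrics (MRR, nDCG@10) are usually the more informative ones.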