---
title: SBERT + FAISS Semantic Search
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.45.0
app_file: app.py
pinned: false
---

# SBERT + FAISS Semantic Search with Evaluation Metrics

This Hugging Face Space hosts a **semantic search system** built with:

- [Sentence-BERT (SBERT)](https://www.sbert.net/) for embeddings  
- [FAISS](https://faiss.ai/) for fast vector search  
- [MS MARCO v1.1 dataset](https://microsoft.github.io/msmarco/) (10,000-passage subset)  
- [Gradio](https://gradio.app/) for the interactive interface  

---

## 🔹 Features
- Enter a **query** to retrieve the **Top-10 most similar passages**.  
- Computes **IR metrics from ground-truth relevance labels** when the query matches one in the MS MARCO validation set:
  - Precision@10  
  - Recall@10  
  - F1-score  
  - Mean Reciprocal Rank (MRR)  
  - Normalized Discounted Cumulative Gain (nDCG@10)  
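
The metrics listed above can be sketched for a single query as follows. This is a minimal illustration that assumes binary relevance (a passage is either relevant or not) and a cutoff of 10; the function names are hypothetical and are not taken from `app.py`. For one query, MRR reduces to the reciprocal rank of the first relevant passage, and the mean is taken across queries.

```python
import math
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k results that are relevant."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(ranked_ids: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of all relevant passages that appear in the top-k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def reciprocal_rank(ranked_ids: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """1 / rank of the first relevant passage in the top-k, else 0.0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked_ids: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """nDCG with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0
```

Each function takes the ranked passage IDs returned for a query and the set of IDs labeled relevant for that query, so the same inputs can be reused for all five metrics.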

---

## 🔹 How to Use
1. Type a query into the input box.  
2. Press **Submit**.  
3. View:  
   - **Top-10 retrieved passages** with similarity scores  
   - **Evaluation metrics** if the query exists in the validation set  
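
Metrics appear only when the typed query can be matched to a validation-set query. One way this check could work is sketched below. It assumes matching is done on a normalized string (lowercased, whitespace collapsed) and that relevance labels are stored as a set of passage indices per query. Both are assumptions for illustration, and the helper names are hypothetical.

```python
from typing import Dict, Optional, Set


def normalize_query(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences still match."""
    return " ".join(text.lower().split())


def build_query_lookup(validation_queries: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """Map each normalized validation query to its relevant passage indices."""
    return {normalize_query(q): set(rel) for q, rel in validation_queries.items()}


def relevant_for(query: str, lookup: Dict[str, Set[int]]) -> Optional[Set[int]]:
    """Return the relevant passage indices, or None if the query is not in the set."""
    return lookup.get(normalize_query(query))
```

When `relevant_for` returns `None`, the interface would show only the retrieved passages and skip the metrics.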

---

## 🔹 Tech Stack
- **Embeddings:** `sentence-transformers/all-mpnet-base-v2`  
- **Indexing:** FAISS (L2 distance)  
- **Dataset:** MS MARCO v1.1 (first 10,000 passages)  
- **Interface:** Gradio  
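
The sketch below shows how these components might fit together. It assumes an exact `faiss.IndexFlatL2` index, which the list above does not confirm, and the function names are hypothetical. `rank_by_l2` is a plain-Python reference for the ranking rule (smallest squared L2 distance first), while `build_and_search` does the same with the real libraries and imports them lazily so the reference function can be used without them.

```python
from typing import List, Sequence, Tuple


def rank_by_l2(
    query_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
    k: int = 10,
) -> List[Tuple[int, float]]:
    """Exact top-k by squared L2 distance, closest first (lower is better)."""
    scored = [
        (i, sum((q - p) ** 2 for q, p in zip(query_vec, vec)))
        for i, vec in enumerate(passage_vecs)
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


def build_and_search(
    passages: Sequence[str],
    query: str,
    k: int = 10,
    model_name: str = "sentence-transformers/all-mpnet-base-v2",
) -> List[Tuple[int, float]]:
    """Embed passages, index them with FAISS, and return (passage index, distance)."""
    # Heavy dependencies are imported here so the module loads without them.
    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    passage_vecs = model.encode(list(passages), convert_to_numpy=True).astype("float32")

    # Assumption: a flat (exhaustive) L2 index; it reports squared L2 distances.
    index = faiss.IndexFlatL2(passage_vecs.shape[1])
    index.add(passage_vecs)

    query_vec = model.encode([query], convert_to_numpy=True).astype("float32")
    distances, ids = index.search(query_vec, min(k, index.ntotal))
    return [(int(i), float(d)) for i, d in zip(ids[0], distances[0])]
```

Because the index ranks by distance, a lower score means a closer match. A flat index compares the query against every stored vector, which is typically fast enough for a 10,000-passage collection.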

---

## 🔹 Citation
If you use this system in research, please cite:

- [Sentence-BERT](https://arxiv.org/abs/1908.10084)  
- [MS MARCO](https://microsoft.github.io/msmarco/)  

---

## 🔹 Author
Built for a research project on **user-centered evaluation of semantic search systems**.