File size: 1,236 Bytes
f2e9e7e a955ea0 7235e65 a955ea0 f2e9e7e bec85f8 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
---
pipeline_tag: sentence-similarity
base_model: Qwen/Qwen3-Embedding-0.6B
license: apache-2.0
library_name: peft
tags:
- peft
- lora
- sentence-transformers
- semantic-search
- faiss
- hnsw
- mathlib4
- lean4
---
# Lightweight Semantic Mathlib Search — LoRA Adapter (Demo)
**What this is:**
A small, reproducible demo adapter for `Qwen/Qwen3-Embedding-0.6B` that enables *lightweight, LLM-free* semantic search over mathlib (Lean 4).
Supports both natural language and formula-based search (can be used in combination) for finding Lean theorems or mathematical statements.
Does not support LaTeX syntax search yet.
Data is built from ~180k Lean theorems extracted via LeanDojo for **Lean 4.20.1** (there may be minor gaps vs. the full mathlib).
> Part of the AITP 2025 submission: **“Towards Lightweight and LLM-Free Semantic Search for mathlib4.”**
## Contents
- `adapter/adapter_config.json` & `adapter/adapter_model.safetensors` — LoRA weights for the encoder
- `faiss/<*.idx>` — HNSW FAISS index (built for cosine/inner-product search)
- `faiss/*public_index.jsonl` — public mapping `{ idx, full_name, statement }`
## DEMO
https://github.com/IsaacLi74/Lightweight-and-LLM-Free-Semantic-Search-for-mathlib4 |