adrien-riaux committed on
Commit 8970a21 · verified · 1 Parent(s): 08ccd47
Files changed (1)
  1. README.md +13 -10
README.md CHANGED
@@ -6,22 +6,23 @@ tags:
  base_model: nomic-ai/modernbert-embed-base
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
- license: mit
  ---
 
- # ModernBERT Embed Base Distilled
 
- This is a [sentence-transformers](https://www.SBERT.net) model distilled from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base). It maps sentences & paragraphs to a 256-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
  ## Model Details
 
  ### Model Description
  - **Model Type:** Sentence Transformer
  - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
- - **Maximum Sequence Length:** 8 192 tokens
  - **Output Dimensionality:** 256 dimensions
  - **Similarity Function:** Cosine Similarity
-
 
  ### Model Sources
 
@@ -54,7 +55,7 @@ Then you can load this model and run inference.
  from sentence_transformers import SentenceTransformer
 
  # Download from the 🤗 Hub
- model = SentenceTransformer("adrien-riaux/distill-modernbert-embed-base")
  # Run inference
  sentences = [
  'The weather is lovely today.',
@@ -109,17 +110,19 @@ You can finetune this model on your own dataset.
 
  ## Training Details
 
- ### Distillation Process
-
- The model is distilled using [Model2Vec](https://huggingface.co/blog/Pringled/model2vec) framework. It is a new technique for creating extremely fast and small static embedding models from any Sentence Transformer.
-
  ### Framework Versions
  - Python: 3.11.9
  - Sentence Transformers: 3.4.1
  - Transformers: 4.48.3
  - PyTorch: 2.2.2
  - Tokenizers: 0.21.0
 
  <!--
  ## Glossary
 
 
  base_model: nomic-ai/modernbert-embed-base
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  ---
 
+ # SentenceTransformer based on nomic-ai/modernbert-embed-base
 
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base). It maps sentences & paragraphs to a 256-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
  ## Model Details
 
  ### Model Description
  - **Model Type:** Sentence Transformer
  - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
+ - **Maximum Sequence Length:** inf tokens
  - **Output Dimensionality:** 256 dimensions
  - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
 
  ### Model Sources
 
 
  from sentence_transformers import SentenceTransformer
 
  # Download from the 🤗 Hub
+ model = SentenceTransformer("AdrienRiaux/distill-modernbert-embed-base")
  # Run inference
  sentences = [
  'The weather is lovely today.',
 
 
  ## Training Details
 
  ### Framework Versions
  - Python: 3.11.9
  - Sentence Transformers: 3.4.1
  - Transformers: 4.48.3
  - PyTorch: 2.2.2
+ - Accelerate:
+ - Datasets:
  - Tokenizers: 0.21.0
 
+ ## Citation
+
+ ### BibTeX
+
  <!--
  ## Glossary
 