lee101 committed
Commit 196e0f3 · verified · Parent: 7c49de8

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +139 -3
README.md CHANGED (the previous revision held only YAML front matter declaring `license: mit`; the new contents follow in full):
---
language:
- en
license: mit
library_name: gobed
tags:
- embeddings
- semantic-search
- int8
- quantized
- static-embeddings
- sentence-embeddings
pipeline_tag: sentence-similarity
---

# Bed - Int8 Quantized Static Embeddings for Semantic Search

Ultra-fast int8 quantized static embeddings model for semantic search. Optimized for the [gobed](https://github.com/lee101/gobed) Go library.

## Model Details

| Property | Value |
|----------|-------|
| **Dimensions** | 512 |
| **Precision** | int8 + scale factors |
| **Vocabulary** | 30,522 tokens |
| **Model Size** | 15 MB |
| **Format** | safetensors |

## Performance

- **Embedding latency**: 0.16 ms average
- **Throughput**: 6,200+ embeddings/sec
- **Memory**: 15 MB (7.9x smaller than the float32 version)
- **Compression ratio**: 87.4% space reduction vs. the original

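These figures depend on hardware. As a rough way to time gobed end to end on your own machine, the sketch below is a hypothetical benchmark that uses only the `NewAutoSearchEngine`, `AddDocuments`, and `SearchWithMetadata` calls shown in the Usage section below; it times a full search (embedding plus scoring) rather than the embedding step in isolation, so treat the result as an upper bound on per-query embedding cost.

```go
// bench_test.go - rough timing sketch; corpus size and query are arbitrary.
package main

import (
    "fmt"
    "testing"

    "github.com/lee101/gobed"
)

func BenchmarkSearch(b *testing.B) {
    engine, err := gobed.NewAutoSearchEngine()
    if err != nil {
        b.Fatal(err)
    }
    defer engine.Close()

    // Index a small synthetic corpus before the timed loop.
    docs := make(map[string]string, 1000)
    for i := 0; i < 1000; i++ {
        docs[fmt.Sprintf("doc%d", i)] = fmt.Sprintf("example document %d about embeddings and search", i)
    }
    engine.AddDocuments(docs)

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        engine.SearchWithMetadata("semantic search for machine learning", 5)
    }
}
```

Run it with `go test -bench=BenchmarkSearch`.
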
## Usage with gobed (Go)

```bash
go get github.com/lee101/gobed
```

```go
package main

import (
    "fmt"
    "log"

    "github.com/lee101/gobed"
)

func main() {
    engine, err := gobed.NewAutoSearchEngine()
    if err != nil {
        log.Fatal(err)
    }
    defer engine.Close()

    docs := map[string]string{
        "doc1": "machine learning and neural networks",
        "doc2": "natural language processing",
    }
    engine.AddDocuments(docs)

    results, _, _ := engine.SearchWithMetadata("AI research", 3)
    for _, r := range results {
        fmt.Printf("[%.3f] %s\n", r.Similarity, r.Content)
    }
}
```

## Download Model Manually

```bash
# Clone the model repository
git clone https://huggingface.co/lee101/bed

# Or download specific files
wget https://huggingface.co/lee101/bed/resolve/main/modelint8_512dim.safetensors
wget https://huggingface.co/lee101/bed/resolve/main/tokenizer.json
```

## Using huggingface_hub (Python)

```python
from huggingface_hub import hf_hub_download

# Download model file
model_path = hf_hub_download(repo_id="lee101/bed", filename="modelint8_512dim.safetensors")

# Download tokenizer
tokenizer_path = hf_hub_download(repo_id="lee101/bed", filename="tokenizer.json")
```

## Model Architecture

This model uses static embeddings with int8 quantization:

- **Embedding layer**: 30,522 x 512 int8 weights
- **Scale factors**: 30,522 float32 scale values (one per token)
- **Tokenizer**: WordPiece tokenizer (same as BERT)

Embeddings are computed by (a sketch follows this list):
1. Tokenizing the input text
2. Looking up the int8 embedding for each token
3. Multiplying by the per-token scale factor to reconstruct float values
4. Mean pooling across tokens

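The following is a minimal sketch of steps 2-4 (lookup, dequantize, mean pool) on toy data; it does not use gobed's internal API, and the real model applies the same arithmetic to its 30,522 x 512 int8 matrix.

```go
package main

import "fmt"

// dequantize reconstructs a float32 vector from int8 weights and that token's scale factor.
func dequantize(q []int8, scale float32) []float32 {
    out := make([]float32, len(q))
    for i, v := range q {
        out[i] = float32(v) * scale
    }
    return out
}

// meanPool averages the dequantized embeddings of the given token IDs.
func meanPool(ids []int, weights [][]int8, scales []float32) []float32 {
    dim := len(weights[0])
    pooled := make([]float32, dim)
    for _, id := range ids {
        for i, v := range dequantize(weights[id], scales[id]) {
            pooled[i] += v
        }
    }
    for i := range pooled {
        pooled[i] /= float32(len(ids))
    }
    return pooled
}

func main() {
    // Toy vocabulary of 3 tokens with 4-dim embeddings (the real model is 30,522 x 512).
    weights := [][]int8{
        {127, -64, 0, 32},
        {10, 20, 30, 40},
        {-128, 127, -1, 1},
    }
    scales := []float32{0.01, 0.02, 0.005}

    // Pretend the tokenizer produced token IDs 0 and 2 for some input text.
    fmt.Println(meanPool([]int{0, 2}, weights, scales))
}
```
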
## Quantization Details

- Original model: 30,522 x 1024 float32 (119 MB)
- Quantized model: 30,522 x 512 int8 + 30,522 float32 scale factors (15 MB)

The savings come from halving the embedding dimension (1024 to 512) as well as from int8 quantization: 30,522 x 1024 x 4 bytes versus 30,522 x (512 + 4) bytes, an 87.4% reduction.

Per-vector quantization preserves relative magnitudes within each embedding:

```python
import numpy as np

# embedding_vector: one float32 row of the original embedding matrix
max_abs = np.abs(embedding_vector).max()
scale = max_abs / 127.0
quantized = np.round(embedding_vector / scale).astype(np.int8)
```

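As a sanity check on the scheme above (a standalone sketch, not part of gobed), the Go snippet below quantizes one vector with a per-vector scale, dequantizes it again, and reports the worst-case round-trip error, which stays within half a quantization step (scale / 2).

```go
package main

import (
    "fmt"
    "math"
)

func main() {
    vec := []float32{0.91, -0.43, 0.07, -1.20, 0.55}

    // Per-vector symmetric quantization: scale = max|x| / 127.
    maxAbs := 0.0
    for _, v := range vec {
        maxAbs = math.Max(maxAbs, math.Abs(float64(v)))
    }
    scale := float32(maxAbs / 127.0)

    quantized := make([]int8, len(vec))
    for i, v := range vec {
        quantized[i] = int8(math.Round(float64(v / scale)))
    }

    // Dequantize and measure the worst-case round-trip error.
    var maxErr float64
    for i, q := range quantized {
        recon := float32(q) * scale
        maxErr = math.Max(maxErr, math.Abs(float64(recon-vec[i])))
    }
    fmt.Printf("scale=%.5f max round-trip error=%.5f (bound %.5f)\n", scale, maxErr, scale/2)
}
```

For the real model this is applied row by row, giving one scale factor per token, as described above.
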
## Files

- `modelint8_512dim.safetensors` - Quantized embeddings and scale factors
- `tokenizer.json` - HuggingFace tokenizer

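To see exactly which tensors the file holds without loading it into any framework, you can read the safetensors header directly: the format starts with an 8-byte little-endian length followed by a JSON header listing each tensor's name, dtype, and shape. The sketch below is plain Go, independent of gobed, and assumes `modelint8_512dim.safetensors` has already been downloaded to the working directory.

```go
package main

import (
    "encoding/binary"
    "encoding/json"
    "fmt"
    "io"
    "log"
    "os"
)

func main() {
    f, err := os.Open("modelint8_512dim.safetensors")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    // First 8 bytes: little-endian uint64 length of the JSON header.
    var headerLen uint64
    if err := binary.Read(f, binary.LittleEndian, &headerLen); err != nil {
        log.Fatal(err)
    }

    // The header maps tensor names to {dtype, shape, data_offsets}.
    raw := make([]byte, headerLen)
    if _, err := io.ReadFull(f, raw); err != nil {
        log.Fatal(err)
    }

    var header map[string]json.RawMessage
    if err := json.Unmarshal(raw, &header); err != nil {
        log.Fatal(err)
    }
    for name, info := range header {
        fmt.Printf("%s: %s\n", name, info)
    }
}
```
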
## License

MIT License - see [gobed repository](https://github.com/lee101/gobed) for details.

## Citation

```bibtex
@software{gobed,
  author = {Lee Penkman},
  title  = {gobed: Ultra-Fast Semantic Search for Go},
  url    = {https://github.com/lee101/gobed},
  year   = {2024}
}
```