deebak14 committed on
Commit 178a831 · verified · 1 Parent(s): 0bf5870

Add new SentenceTransformer model
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+{
+    "word_embedding_dimension": 768,
+    "pooling_mode_cls_token": false,
+    "pooling_mode_mean_tokens": true,
+    "pooling_mode_max_tokens": false,
+    "pooling_mode_mean_sqrt_len_tokens": false,
+    "pooling_mode_weightedmean_tokens": false,
+    "pooling_mode_lasttoken": false,
+    "include_prompt": true
+}
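For readers unfamiliar with these flags: `pooling_mode_mean_tokens: true` means the sentence embedding is the average of the token embeddings over non-padding positions. A minimal NumPy sketch of that computation (the function name and array shapes are illustrative, not part of this repository):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over positions where attention_mask == 1."""
    # token_embeddings: (seq_len, dim); attention_mask: (seq_len,) of 0/1
    mask = attention_mask.astype(float)[:, None]      # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)    # masked sum over tokens
    count = max(mask.sum(), 1e-9)                     # avoid division by zero
    return summed / count

# Two real tokens plus one padding token: only the first two are averaged
tokens = np.array([[1.0, 1.0], [3.0, 3.0], [99.0, 99.0]])
print(mean_pool(tokens, np.array([1, 1, 0])))         # -> [2. 2.]
```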
2_Dense/config.json ADDED
@@ -0,0 +1,6 @@
+{
+    "in_features": 768,
+    "out_features": 3072,
+    "bias": false,
+    "activation_function": "torch.nn.modules.linear.Identity"
+}
2_Dense/model.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:326fa24fcde2b20cdcad2c34ce47d3905651291d1eb387b917c93c1464f3fea5
+size 9437272
3_Dense/config.json ADDED
@@ -0,0 +1,6 @@
+{
+    "in_features": 3072,
+    "out_features": 768,
+    "bias": false,
+    "activation_function": "torch.nn.modules.linear.Identity"
+}
3_Dense/model.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6724be99108cbf32d3b6d7047fb001549527a428a6b4945f967a457159c3e811
+size 9437272
README.md ADDED
@@ -0,0 +1,533 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ - generated_from_trainer
+ - dataset_size:15565
+ - loss:MultipleNegativesRankingLoss
+ base_model: google/embeddinggemma-300m
+ widget:
+ - source_sentence: I need to lock an object in my model so I can work on other parts
+     without accidentally selecting it. How can I do that?
+   sentences:
+   - You cannot use the following methods IsObjectLocked, LockObjects, UnlockObject,
+     SelectObject, SelectObjects, UnlockObjects, IsObjectSelectable, ShowObject, IsObjectNormal
+   - object
+   - "You can use the following methods to complete the task.\nmethod: LockObject\n\
+     description: Locks a single object. Locked objects are visible, and they can be\r\
+     \ snapped to. But, they cannot be selected.\nsyntax: LockObject(object_id)\n\
+     parameters: object_id (guid): The identifier of an object\nreturns: bool: True\
+     \ or False indicating success or failure\n\nFollowing is the code that uses this\
+     \ method to complete the task as per user query.\n\n```python\nimport rhinoscriptsyntax\
+     \ as rs\n\n# Lock an object in the model to prevent accidental selection\nid =\
+     \ rs.GetObject(\"Select object to lock\")\nif id:\n rs.LockObject(id)\n \
+     \ print(\"Object locked successfully.\")\nelse:\n print(\"No object selected.\"\
+     )\n```"
+ - source_sentence: I want to create a cloud of points in my Rhino model. Can you show
+     me how to do that?
+   sentences:
+   - "You can use the following methods to complete the task.\nmethod: AddPointCloud\n\
+     description: Adds point cloud object to the document\nsyntax: AddPointCloud(points,\
+     \ colors=None)\nparameters: \npoints ([point, ....]): list of values where every\
+     \ multiple of three represents a point\r\ncolors ([color, ...]): list of colors\
+     \ to apply to each point\n\nreturns: \nguid: identifier of point cloud on success\n\
+     \n\nFollowing is the code that uses this method to complete the task as per user\
+     \ query.\n\n```python\nimport rhinoscriptsyntax as rs\n\n# Create a cloud of points\
+     \ in Rhino\npoints = [(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3)] # Define points\n\
+     rs.AddPointCloud(points) # Add the point cloud to the model\n```"
+   - geometry
+   - You cannot use the following methods PointCloudPoints, AddPoints, CreatePoint,
+     PointCloudCount, AddPoint, AddLine, PointCloudHidePoints, CreateVector, PointCoordinates
+ - source_sentence: I need to find out which vertices make up each face of my mesh.
+     Can you help me with that?
+   sentences:
+   - "You can use the following methods to complete the task.\nmethod: MeshFaces\n\
+     description: Returns face vertices of a mesh\nsyntax: MeshFaces(object_id, face_type=True)\n\
+     parameters: object_id (guid): identifier of a mesh object\nface_type (bool, optional):\
+     \ The face type to be returned. True = both triangles and quads. False = only\
+     \ triangles\nreturns: list([point, point, point, point], ...): 3D points that\
+     \ define the face vertices of the mesh. If face_type is True, then faces are returned\
+     \ as both quads and triangles (4 3D points). For triangles, the third and fourth\
+     \ vertex will be identical. If face_type is False, then faces are returned as\
+     \ only triangles(3 3D points). Quads will be converted to triangles.\n\nFollowing\
+     \ is the code that uses this method to complete the task as per user query.\n\n\
+     ```python\nimport rhinoscriptsyntax as rs\n# Get the mesh object from the user\n\
+     obj = rs.GetObject(\"Select mesh\", rs.filter.mesh)\n# Retrieve the vertex indices\
+     \ for each face of the mesh\nfaces = rs.MeshFaces(obj, True)\nif faces:\n rs.EnableRedraw(False)\n\
+     \ i = 0\n while i < len(faces):\n # Each face can be a triangle or\
+     \ a quad\n face = faces[i:i+4] if len(faces) > i + 3 else faces[i:i+3]\n\
+     \ print(\"Face vertices:\", face)\n i += 3 if len(face) == 3 else\
+     \ 4\n rs.EnableRedraw(True)\n```"
+   - You cannot use the following methods MeshVertexFaces, MeshFaceVertices, MeshVertices,
+     MeshVertexCount, MeshFaceCenters, MeshTriangleCount, MeshQuadCount, MeshFaceCount,
+     MeshNakedEdgePoints
+   - mesh
+ - source_sentence: Can you show me how to check if two transformation matrices are
+     the same in Rhino?
+   sentences:
+   - "You can use the following methods to complete the task.\nmethod: XformChangeBasis2\n\
+     description: Returns a change of basis transformation matrix of None on error\n\
+     syntax: XformChangeBasis2(x0,y0,z0,x1,y1,z1)\nparameters: \nx0,y0,z0 (vector):\
+     \ initial basis\r\nx1,y1,z1 (vector): final basis\n\nreturns: \ntransform: The\
+     \ 4x4 transformation matrix if successful\r\nNone: if not successful\n\n\nFollowing\
+     \ is the code that uses this method to complete the task as per user query.\n\n\
+     ```python\nimport rhinoscriptsyntax as rs\n\n# Function to check if two transformation\
+     \ matrices are the same\n# Parameters: mat1, mat2 - transformation matrices to\
+     \ compare\n# Returns: True if they are the same, False otherwise\ndef are_matrices_equal(mat1,\
+     \ mat2):\n return rs.XformCompare(mat1, mat2) == 0\n\n# Example usage\nmatrix1\
+     \ = rs.XformChangeBasis2(1, 0, 0, 0, 1, 0)\nmatrix2 = rs.XformChangeBasis2(1,\
+     \ 0, 0, 0, 1, 0)\nresult = are_matrices_equal(matrix1, matrix2)\nprint(\"Matrices\
+     \ are equal:\" , result)\n```"
+   - You cannot use the following methods XformCompare, IsXformSimilarity, IsXformIdentity,
+     IsXformZero, CompareGeometry, CreateXform, VectorTransform, XformTranslation,
+     TransformObject, XformDeterminant
+   - transformation
+ - source_sentence: I need to find where a flat surface meets a sphere. How can I do
+     that in Rhino?
+   sentences:
+   - plane
+   - You cannot use the following methods LineSphereIntersection, IsSphere, AddSphere,
+     LinePlaneIntersection, Angle, CircleCenterPoint, CurveCurveIntersection, CurveSurfaceIntersection,
+     AddCircle3Pt
+   - "You can use the following methods to complete the task.\nmethod: PlaneSphereIntersection\n\
+     description: Calculates the intersection of a plane and a sphere.\nsyntax: PlaneSphereIntersection(plane,\
+     \ sphere_plane, sphere_radius)\nparameters: plane (plane): The plane to intersect;\
+     \ sphere_plane (plane): Equatorial plane of the sphere (origin is center); sphere_radius\
+     \ (float): Radius of the sphere.\nreturns: list: [type, point/plane, radius] where\
+     \ type=0 for point, 1 for circle. None on error.\n\nFollowing is the code that\
+     \ uses this method to complete the task as per user query.\n\n```python\nimport\
+     \ rhinoscriptsyntax as rs\n\n# Define a flat surface as a plane\nplane = rs.WorldXYPlane()\n\
+     # Define the radius of the sphere\nradius = 10\n# Calculate the intersection between\
+     \ the plane and the sphere\nresults = rs.PlaneSphereIntersection(plane, plane,\
+     \ radius)\n\n# Check if there are results and handle them accordingly\nif results:\n\
+     \ if results[0] == 0:\n # If the intersection is a point, add it to\
+     \ the document\n rs.AddPoint(results[1])\n else:\n # If the intersection\
+     \ is a circle, add it to the document\n rs.AddCircle(results[1], results[2])\n\
+     ```"
+ datasets:
+ - deebak14/embedding_tuple_data_v1
+ - deebak14/embedding_triplet_data_v1
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy
+ model-index:
+ - name: SentenceTransformer based on google/embeddinggemma-300m
+   results:
+   - task:
+       type: triplet
+       name: Triplet
+     dataset:
+       name: base eval
+       type: base-eval
+     metrics:
+     - type: cosine_accuracy
+       value: 1.0
+       name: Cosine Accuracy
+ ---
+
+ # SentenceTransformer based on google/embeddinggemma-300m
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [google/embeddinggemma-300m](https://huggingface.co/google/embeddinggemma-300m) on the [embedding_tuple_data_v1](https://huggingface.co/datasets/deebak14/embedding_tuple_data_v1) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [google/embeddinggemma-300m](https://huggingface.co/google/embeddinggemma-300m) <!-- at revision 57c266a740f537b4dc058e1b0cda161fd15afa75 -->
+ - **Maximum Sequence Length:** 2048 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - [embedding_tuple_data_v1](https://huggingface.co/datasets/deebak14/embedding_tuple_data_v1)
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
+   (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
+   (4): Normalize()
+ )
+ ```
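As a shape-level sketch of the pipeline above: random matrices stand in for the learned (bias-free) Dense weights, so only the dimensions are meaningful, not the numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for the learned, bias-free Dense weights (shapes from the configs above)
W2_dense = rng.normal(size=(768, 3072))   # module (2): 768 -> 3072
W3_dense = rng.normal(size=(3072, 768))   # module (3): 3072 -> 768

pooled = rng.normal(size=(1, 768))        # output of the mean-pooling module (1)
x = pooled @ W2_dense                     # (1, 3072)
x = x @ W3_dense                          # (1, 768)
embedding = x / np.linalg.norm(x, axis=1, keepdims=True)  # module (4): Normalize

print(embedding.shape)                    # (1, 768)
```

Because of the final Normalize module, dot product and cosine similarity coincide on the output embeddings.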
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("deebak14/embedding_gemma_ft_v1")
+ # Run inference
+ queries = [
+     "I need to find where a flat surface meets a sphere. How can I do that in Rhino?",
+ ]
+ documents = [
+     'You can use the following methods to complete the task.\nmethod: PlaneSphereIntersection\ndescription: Calculates the intersection of a plane and a sphere.\nsyntax: PlaneSphereIntersection(plane, sphere_plane, sphere_radius)\nparameters: plane (plane): The plane to intersect; sphere_plane (plane): Equatorial plane of the sphere (origin is center); sphere_radius (float): Radius of the sphere.\nreturns: list: [type, point/plane, radius] where type=0 for point, 1 for circle. None on error.\n\nFollowing is the code that uses this method to complete the task as per user query.\n\n```python\nimport rhinoscriptsyntax as rs\n\n# Define a flat surface as a plane\nplane = rs.WorldXYPlane()\n# Define the radius of the sphere\nradius = 10\n# Calculate the intersection between the plane and the sphere\nresults = rs.PlaneSphereIntersection(plane, plane, radius)\n\n# Check if there are results and handle them accordingly\nif results:\n    if results[0] == 0:\n        # If the intersection is a point, add it to the document\n        rs.AddPoint(results[1])\n    else:\n        # If the intersection is a circle, add it to the document\n        rs.AddCircle(results[1], results[2])\n```',
+     'You cannot use the following methods LineSphereIntersection, IsSphere, AddSphere, LinePlaneIntersection, Angle, CircleCenterPoint, CurveCurveIntersection, CurveSurfaceIntersection, AddCircle3Pt',
+     'plane',
+ ]
+ query_embeddings = model.encode_query(queries)
+ document_embeddings = model.encode_document(documents)
+ print(query_embeddings.shape, document_embeddings.shape)
+ # [1, 768] [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(query_embeddings, document_embeddings)
+ print(similarities)
+ # tensor([[ 0.6658, 0.4819, -0.1617]])
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Triplet
+
+ * Dataset: `base-eval`
+ * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
+
+ | Metric | Value |
+ |:--------------------|:--------|
+ | **cosine_accuracy** | **1.0** |
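For context, `cosine_accuracy` from `TripletEvaluator` is the fraction of (anchor, positive, negative) triplets in which the anchor is more cosine-similar to its positive than to its negative. A small illustrative NumPy sketch (not the library implementation):

```python
import numpy as np

def triplet_accuracy(anchor: np.ndarray, positive: np.ndarray, negative: np.ndarray) -> float:
    """Fraction of triplets where cos(anchor, positive) > cos(anchor, negative)."""
    def cos(u, v):
        return np.sum(u * v, axis=1) / (np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1))
    return float(np.mean(cos(anchor, positive) > cos(anchor, negative)))

a = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[0.9, 0.1], [0.1, 0.9]])    # close to the anchors
n = np.array([[-1.0, 0.0], [1.0, -0.2]])  # farther from the anchors
print(triplet_accuracy(a, p, n))          # -> 1.0
```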
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### embedding_tuple_data_v1
+
+ * Dataset: [embedding_tuple_data_v1](https://huggingface.co/datasets/deebak14/embedding_tuple_data_v1) at [b592a1a](https://huggingface.co/datasets/deebak14/embedding_tuple_data_v1/tree/b592a1af60cff995640f6979dbf36c01d38c40a8)
+ * Size: 15,565 training samples
+ * Columns: <code>anchor</code> and <code>positive</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | anchor | positive |
+   |:--------|:---------|:---------|
+   | type | string | string |
+   | details | <ul><li>min: 9 tokens</li><li>mean: 15.98 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 42 tokens</li><li>mean: 177.53 tokens</li><li>max: 810 tokens</li></ul> |
+ * Samples:
+   | anchor | positive |
+   |:--------|:---------|
+   | <code>Provide an example of using AddRectangle.</code> | <code>import rhinoscriptsyntax as rs<br>plane = rs.WorldXYPlane()<br>plane = rs.RotatePlane(plane, 45.0, [0,0,1])<br>rs.AddRectangle(plane, 5.0, 15.0) <br><br>Metadata: <br>Name: AddRectangle<br>Category: curve<br>Function Signature: rs.AddRectangle(plane: plane, width: number, height: number) -> guid<br>Description: Add a rectangular curve to the document</code> |
+   | <code>How do I search for the total number of linetypes in my document?</code> | <code><br><br>You can use the following method:<br>Name: LinetypeCount<br>Category: linetype<br>Function Signature: rs.LinetypeCount() -> int<br>Description: Description: Returns the number of linetypes in the document.<br>Parameters:<br> None<br>Returns:<br> int: The number of linetypes in the document.</code> |
+   | <code>How do I maintain the shape of a curve while fitting it?</code> | <code><br><br>You can use the following method:<br>Name: FitCurve<br>Category: curve<br>Function Signature: rs.FitCurve(curve_id: guid, degree: int = 3, distance_tolerance: float = -1, angle_tolerance: float = -1) -> guid<br>Description: Description: Reduces the number of control points of a curve while maintaining its general shape. This function is useful for replacing curves with many control points. For more information, see the Rhino help for the FitCrv command.<br>Parameters:<br> curve_id (guid): Identifier of the curve object to be fitted.<br> eg: '3D4F5A6B-7C8D-9E0F-1A2B-3C4D5E6F7A8B'<br> degree (int, optional): The degree of the curve, which must be greater than 1. The default is 3.<br> eg: 3<br> distance_tolerance (float, optional): The fitting tolerance. If not specified or <= 0.0, the document absolute tolerance is used.<br> eg: 0.01<br> angle_tolerance (float, optional): The kink smoothing tolerance in degrees. If 0.0, all kinks are smoothed. If > 0.0, kinks smaller than this value are smoothed. If ...</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "gather_across_devices": false
+   }
+   ```
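For intuition, MultipleNegativesRankingLoss with `cos_sim` and `scale=20.0` treats each anchor's paired positive as the correct "class" among all in-batch positives, and applies a cross-entropy over the scaled cosine-similarity matrix. A hedged NumPy sketch of that computation (illustrative, not the library code):

```python
import numpy as np

def mnrl(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """In-batch-negatives cross-entropy over scaled cosine similarities."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)            # (batch, batch); row i's target is column i
    row_max = scores.max(axis=1, keepdims=True)
    logsumexp = np.log(np.exp(scores - row_max).sum(axis=1)) + row_max[:, 0]
    return float(np.mean(logsumexp - np.diag(scores)))

# Perfectly matched pairs score near zero; shuffled positives are heavily penalized
print(mnrl(np.eye(3), np.eye(3)))                      # close to 0
print(mnrl(np.eye(3), np.roll(np.eye(3), 1, axis=0)))  # close to the scale, 20
```

This is why larger batches make the objective harder (more in-batch negatives per anchor), and why the `no_duplicates` batch sampler below matters: a duplicate positive in the batch would be a false negative.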
+
+ ### Evaluation Dataset
+
+ #### embedding_triplet_data_v1
+
+ * Dataset: [embedding_triplet_data_v1](https://huggingface.co/datasets/deebak14/embedding_triplet_data_v1) at [71ea1de](https://huggingface.co/datasets/deebak14/embedding_triplet_data_v1/tree/71ea1de1dd869a91bfa545c4f65e1d1d5ac41186)
+ * Size: 476 evaluation samples
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 476 samples:
+   | | anchor | positive | negative |
+   |:--------|:---------|:---------|:---------|
+   | type | string | string | string |
+   | details | <ul><li>min: 15 tokens</li><li>mean: 23.13 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 111 tokens</li><li>mean: 252.44 tokens</li><li>max: 1015 tokens</li></ul> | <ul><li>min: 32 tokens</li><li>mean: 42.48 tokens</li><li>max: 55 tokens</li></ul> |
+ * Samples:
+   | anchor | positive | negative |
+   |:--------|:---------|:---------|
+   | <code>I need to flatten a curved surface for laser cutting. How can I do that?</code> | <code>You can use the following methods to complete the task.<br>method: UnrollSurface<br>description: Flattens a developable surface or polysurface<br>syntax: UnrollSurface(surface_id, explode=False, following_geometry=None, absolute_tolerance=None, relative_tolerance=None)<br>parameters: <br>surface_id (guid): the surface's identifier<br>explode (bool, optional): If True, the resulting surfaces ar not joined<br>following_geometry ({guid, ...]): List of curves, dots, and points which<br> should be unrolled with the surface<br><br>returns: <br>list(guid, ...): of unrolled surface ids<br>tuple((guid, ...),(guid, ...)): if following_geometry is not None, a tuple<br> [1] is the list of unrolled surface ids<br> [2] is the list of unrolled following geometry<br><br><br>Following is the code that uses this method to complete the task as per user query.<br><br>```python<br>import rhinoscriptsyntax as rs<br><br># Flatten a curved surface for laser cutting<br>surface = rs.GetObject("Select curved surface to flatten", rs.filter.surface)<br>if surface:<br> # Unrol...</code> | <code>You cannot use the following methods ConvertCurveToPolyline, MeshOutline, PullCurveToMesh, ExplodeText, MeshToNurb, IsCurvePlanar, Angle, AddFilletCurve, MeshVolume</code> |
+   | <code>Can you show me how to rotate a plane by 30 degrees around its normal axis?</code> | <code>You can use the following methods to complete the task.<br>method: PlaneTransform<br>description: Transforms a plane.<br>syntax: PlaneTransform(plane, xform)<br>parameters: plane (plane): Plane to transform; xform (transform): Transformation to apply.<br>returns: plane: Resulting plane if successful, otherwise None.<br><br>Following is the code that uses this method to complete the task as per user query.<br><br>```python<br>import rhinoscriptsyntax as rs<br><br># Get the current construction plane<br>plane = rs.ViewCPlane()<br><br># Create a rotation transformation of 30 degrees around the normal axis (Z-axis)<br>xform = rs.XformRotation(30.0, plane.ZAxis, plane.Origin)<br><br># Apply the transformation to the plane<br>plane = rs.PlaneTransform(plane, xform)<br><br># Set the new construction plane<br>rs.ViewCPlane(None, plane)<br>```</code> | <code>You cannot use the following methods RotatePlane, XformRotation1, PlaneFromNormal, VectorRotate, Angle, PlaneFromFrame, LinePlane, MovePlane, CreatePlane</code> |
+   | <code>I want to change the height of a text dot I just created. How can I do that?</code> | <code>You can use the following methods to complete the task.<br>method: TextDotHeight<br>description: Returns or modified the font height of a text dot<br>syntax: TextDotHeight(object_id, height=None)<br>parameters: <br>object_id (guid): identifier of a text dot object<br>height (number, optional) new font height<br><br>returns: <br>number: If height is not specified, the current text dot height<br>number: If height is specified, the previous text dot height<br>None: on error<br><br><br>Following is the code that uses this method to complete the task as per user query.<br><br>```python<br>import rhinoscriptsyntax as rs<br># Change the height of a text dot<br>obj = rs.GetObject("Select text dot")<br>if rs.IsTextDot(obj):<br> previous_height = rs.TextDotHeight(obj, 15.0) # Set new height to 15.0<br> print(f"Previous height was: {previous_height}")<br>```</code> | <code>You cannot use the following methods TextDotPoint, TextDotFont, TextDotText, TextObjectHeight, IsTextDot, AddTextDot, TextObjectFont, PointCoordinates, ExplodeText</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "gather_across_devices": false
+   }
+   ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `learning_rate`: 2e-05
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `prompts`: task: sentence similarity | query:
+ - `batch_sampler`: no_duplicates
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 3
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `parallelism_config`: None
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
440
+ - `use_liger_kernel`: False
441
+ - `liger_kernel_config`: None
442
+ - `eval_use_gather_object`: False
443
+ - `average_tokens_across_devices`: False
444
+ - `prompts`: task: sentence similarity | query:
445
+ - `batch_sampler`: no_duplicates
446
+ - `multi_dataset_batch_sampler`: proportional
447
+ - `router_mapping`: {}
448
+ - `learning_rate_mapping`: {}
449
+
450
+ </details>
451
+
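The `batch_sampler: no_duplicates` setting matters for the in-batch-negatives loss used here: a batch must not contain the same text twice, otherwise a "negative" could actually be a duplicate of the positive. A minimal greedy sketch of that idea (`no_duplicates_batches` is a hypothetical helper for illustration, not the Sentence Transformers implementation):

```python
def no_duplicates_batches(examples, batch_size):
    """Greedily pack (anchor, positive) pairs into batches with no repeated text."""
    remaining = list(examples)
    batches = []
    while remaining:
        batch, seen, leftover = [], set(), []
        for example in remaining:
            texts = set(example)
            # Add the pair only if none of its texts already appears in this batch
            if len(batch) < batch_size and not (texts & seen):
                batch.append(example)
                seen |= texts
            else:
                leftover.append(example)
        batches.append(batch)
        remaining = leftover
    return batches
```

For example, the pairs `("a", "b")` and `("a", "c")` would land in different batches, so "a" never serves as its own negative.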
452
+ ### Training Logs
+ | Epoch | Step | Training Loss | Validation Loss | base-eval_cosine_accuracy |
+ |:------:|:----:|:-------------:|:---------------:|:-------------------------:|
+ | -1     | -1   | -             | -               | 0.0147                    |
+ | 0.1028 | 100  | 0.1601        | -               | -                         |
+ | 0.2055 | 200  | 0.0474        | 0.2296          | 0.8971                    |
+ | 0.3083 | 300  | 0.0749        | -               | -                         |
+ | 0.4111 | 400  | 0.1037        | 0.1457          | 0.9265                    |
+ | 0.5139 | 500  | 0.0564        | -               | -                         |
+ | 0.6166 | 600  | 0.0706        | 0.3362          | 0.9475                    |
+ | 0.7194 | 700  | 0.0549        | -               | -                         |
+ | 0.8222 | 800  | 0.0427        | 0.2154          | 0.9538                    |
+ | 0.9250 | 900  | 0.0599        | -               | -                         |
+ | 1.0277 | 1000 | 0.0656        | 0.2439          | 0.9706                    |
+ | 1.1305 | 1100 | 0.0409        | -               | -                         |
+ | 1.2333 | 1200 | 0.0283        | 0.2422          | 0.9727                    |
+ | 1.3361 | 1300 | 0.0336        | -               | -                         |
+ | 1.4388 | 1400 | 0.0338        | 0.2397          | 0.9664                    |
+ | 1.5416 | 1500 | 0.0384        | -               | -                         |
+ | 1.6444 | 1600 | 0.0271        | 0.1048          | 0.9832                    |
+ | 1.7472 | 1700 | 0.0305        | -               | -                         |
+ | 1.8499 | 1800 | 0.0240        | 0.1172          | 0.9916                    |
+ | 1.9527 | 1900 | 0.0140        | -               | -                         |
+ | 2.0555 | 2000 | 0.0180        | 0.0898          | 0.9958                    |
+ | 2.1583 | 2100 | 0.0091        | -               | -                         |
+ | 2.2610 | 2200 | 0.0154        | 0.0721          | 0.9916                    |
+ | 2.3638 | 2300 | 0.0123        | -               | -                         |
+ | 2.4666 | 2400 | 0.0119        | 0.0876          | 0.9937                    |
+ | 2.5694 | 2500 | 0.0173        | -               | -                         |
+ | 2.6721 | 2600 | 0.0091        | 0.0482          | 1.0                       |
+ | 2.7749 | 2700 | 0.0211        | -               | -                         |
+ | 2.8777 | 2800 | 0.0146        | 0.0550          | 1.0                       |
+ | 2.9805 | 2900 | 0.0101        | -               | -                         |
+ | -1     | -1   | -             | -               | 1.0                       |
+
+
488
+ ### Framework Versions
+ - Python: 3.12.11
+ - Sentence Transformers: 5.1.0
+ - Transformers: 4.56.1
+ - PyTorch: 2.8.0+cu126
+ - Accelerate: 1.10.1
+ - Datasets: 4.0.0
+ - Tokenizers: 0.22.0
+
497
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
513
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
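The loss cited above treats every other positive in the batch as a negative: with anchors and positives embedded, anchor *i* must rank its own positive highest among all *n* candidates. A numpy sketch of that objective, assuming cosine similarity and a scale of 20 (the library's documented default; `mnrl_loss` is an illustrative name, not the actual implementation):

```python
import numpy as np

def mnrl_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; labels sit on the diagonal."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                   # (n, n): anchor i vs. every positive
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability for softmax
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # anchor i should select positive i
```

Correctly aligned (anchor, positive) pairs yield a lower loss than misaligned ones, which is why duplicate texts in a batch (hence the `no_duplicates` sampler) would corrupt the negatives.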
526
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
added_tokens.json ADDED
@@ -0,0 +1,3 @@
+ {
+   "<image_soft_token>": 262144
+ }
config.json ADDED
@@ -0,0 +1,60 @@
+ {
+   "_sliding_window_pattern": 6,
+   "architectures": [
+     "Gemma3TextModel"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "attn_logit_softcapping": null,
+   "bos_token_id": 2,
+   "dtype": "float32",
+   "eos_token_id": 1,
+   "final_logit_softcapping": null,
+   "head_dim": 256,
+   "hidden_activation": "gelu_pytorch_tanh",
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 1152,
+   "layer_types": [
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "full_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "full_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "full_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "full_attention"
+   ],
+   "max_position_embeddings": 2048,
+   "model_type": "gemma3_text",
+   "num_attention_heads": 3,
+   "num_hidden_layers": 24,
+   "num_key_value_heads": 1,
+   "pad_token_id": 0,
+   "query_pre_attn_scalar": 256,
+   "rms_norm_eps": 1e-06,
+   "rope_local_base_freq": 10000.0,
+   "rope_scaling": null,
+   "rope_theta": 1000000.0,
+   "sliding_window": 512,
+   "transformers_version": "4.56.1",
+   "use_bidirectional_attention": true,
+   "use_cache": true,
+   "vocab_size": 262144
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "model_type": "SentenceTransformer",
+   "__version__": {
+     "sentence_transformers": "5.1.0",
+     "transformers": "4.56.1",
+     "pytorch": "2.8.0+cu126"
+   },
+   "prompts": {
+     "query": "task: search result | query: ",
+     "document": "title: none | text: ",
+     "BitextMining": "task: search result | query: ",
+     "Clustering": "task: clustering | query: ",
+     "Classification": "task: classification | query: ",
+     "InstructionRetrieval": "task: code retrieval | query: ",
+     "MultilabelClassification": "task: classification | query: ",
+     "PairClassification": "task: sentence similarity | query: ",
+     "Reranking": "task: search result | query: ",
+     "Retrieval": "task: search result | query: ",
+     "Retrieval-query": "task: search result | query: ",
+     "Retrieval-document": "title: none | text: ",
+     "STS": "task: sentence similarity | query: ",
+     "Summarization": "task: summarization | query: "
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
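The `prompts` map above is applied at encode time: Sentence Transformers prepends the prompt for the chosen task before tokenizing, e.g. via `model.encode(texts, prompt_name="STS")`. The effect is plain string concatenation, sketched here with a hypothetical `apply_prompt` helper:

```python
# Subset of the prompts from config_sentence_transformers.json above
prompts = {
    "Retrieval-query": "task: search result | query: ",
    "Retrieval-document": "title: none | text: ",
    "STS": "task: sentence similarity | query: ",
}

def apply_prompt(text: str, prompt_name: str) -> str:
    """Prepend the task prompt, as the encoder sees the text before tokenization."""
    return prompts[prompt_name] + text

print(apply_prompt("cheap flights to Oslo", "Retrieval-query"))
# -> task: search result | query: cheap flights to Oslo
```

Queries and documents deliberately get different prefixes ("query:" vs. "title: none | text:"), so the asymmetric retrieval task is encoded in the input itself.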
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:87e90ab39c71442f41afac0d7a6b3dc33bbb4ca1e7e48f5fb2e68bae16854c5d
+ size 1211486072
modules.json ADDED
@@ -0,0 +1,32 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Dense",
+     "type": "sentence_transformers.models.Dense"
+   },
+   {
+     "idx": 3,
+     "name": "3",
+     "path": "3_Dense",
+     "type": "sentence_transformers.models.Dense"
+   },
+   {
+     "idx": 4,
+     "name": "4",
+     "path": "4_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
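The five modules above form a fixed pipeline: 768-d transformer token embeddings, mean pooling (per `1_Pooling/config.json`), a bias-free 768→3072 dense layer, a bias-free 3072→768 dense layer (both with identity activation per the `2_Dense`/`3_Dense` configs), then L2 normalization. A shape-only numpy sketch with random stand-in weights:

```python
import numpy as np

rng = np.random.default_rng(0)
tokens = rng.normal(size=(12, 768))    # 12 token embeddings from the transformer
pooled = tokens.mean(axis=0)           # 1_Pooling: mean over tokens -> (768,)
w_up = rng.normal(size=(768, 3072))    # 2_Dense: 768 -> 3072, bias=false
w_down = rng.normal(size=(3072, 768))  # 3_Dense: 3072 -> 768, bias=false
x = pooled @ w_up                      # identity activation per the Dense configs
x = x @ w_down
embedding = x / np.linalg.norm(x)      # 4_Normalize: unit-length sentence embedding
```

Because the final module L2-normalizes, the cosine similarity declared in `config_sentence_transformers.json` reduces to a plain dot product between embeddings.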
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 2048,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,33 @@
+ {
+   "boi_token": "<start_of_image>",
+   "bos_token": {
+     "content": "<bos>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eoi_token": "<end_of_image>",
+   "eos_token": {
+     "content": "<eos>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "image_token": "<image_soft_token>",
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:216e2a79606fe879c9f17c529c71cd241338407fd5646b595ffd3c4b9ea1d503
+ size 33385262
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
+ size 4689074
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff