hyunsikc committed
Commit 5e8e2e7 (verified)
Parent(s): 527b41a

Add files using upload-large-folder tool
.gitattributes CHANGED
@@ -33,3 +33,9 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ 8a7b90c915c1cecaf381c70594e3f955.edf filter=lfs diff=lfs merge=lfs -text
+ 97bb3cab5f2f7f5f4640c04cbf3b6ee0.edf filter=lfs diff=lfs merge=lfs -text
+ 9ad47915b97d47d3ce069c00271807d6.edf filter=lfs diff=lfs merge=lfs -text
+ eb1a559cd1f53e2ede74f1307030a1d0.edf filter=lfs diff=lfs merge=lfs -text
+ 92713480ca8937ba5a8dadead5278d92.edf filter=lfs diff=lfs merge=lfs -text
+ 0ff335c7ce60753ee28a910e9fab16f4.edf filter=lfs diff=lfs merge=lfs -text
0ff335c7ce60753ee28a910e9fab16f4.edf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cb70ff6f1027da1e8384576d0fd2f8726a58067c3e21c551fee8c2289df412e0
+ size 1222930525
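The six `*.edf` graph files in this commit are tracked with Git LFS, so what Git stores is only the small pointer shown above (spec version, `oid`, `size`) while the ~1.2 GB compiled blob lives in LFS storage. Below is a minimal sketch of how such a pointer can be checked against a downloaded blob; the local file paths are hypothetical and not part of the repository.

```python
# Sketch: parse a Git LFS pointer (version / oid / size) and verify that a
# downloaded blob matches the recorded SHA-256 digest and byte size.
import hashlib
from pathlib import Path

def parse_lfs_pointer(pointer_path: str) -> dict:
    """Return the key/value fields of a Git LFS pointer file."""
    fields = {}
    for line in Path(pointer_path).read_text().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

def verify_blob(pointer_path: str, blob_path: str) -> bool:
    """Compare a blob on disk against its pointer's sha256 oid and size."""
    fields = parse_lfs_pointer(pointer_path)
    expected_sha = fields["oid"].split(":", 1)[1]
    expected_size = int(fields["size"])
    data = Path(blob_path).read_bytes()
    return len(data) == expected_size and hashlib.sha256(data).hexdigest() == expected_sha

# Hypothetical usage, with the pointer saved next to the real blob:
# verify_blob("0ff335c7ce60753ee28a910e9fab16f4.edf.pointer",
#             "0ff335c7ce60753ee28a910e9fab16f4.edf")
```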
8a7b90c915c1cecaf381c70594e3f955.edf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:73a6e3c4b25a50829662d10233901fbaeb610770b244d2d780743b8cc2effef5
+ size 1215314627
92713480ca8937ba5a8dadead5278d92.edf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:eb498ec1cb2a29aa3a9951b57031dcf5813010a78ee5b764ba57308679e0a7cc
+ size 1218953632
97bb3cab5f2f7f5f4640c04cbf3b6ee0.edf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a33c01d1653a010f4b2f90d69a7e6f25791efcd8101ac9d1c089890a252d1f45
+ size 1223602936
9ad47915b97d47d3ce069c00271807d6.edf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1268b4add4f2602bdc8f941d2f049f6258b3e377dcc7aa12e8383ac2eefdc2bd
+ size 1219094832
LICENSE ADDED
@@ -0,0 +1,202 @@
1
+
2
+ Apache License
3
+ Version 2.0, January 2004
4
+ http://www.apache.org/licenses/
5
+
6
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
7
+
8
+ 1. Definitions.
9
+
10
+ "License" shall mean the terms and conditions for use, reproduction,
11
+ and distribution as defined by Sections 1 through 9 of this document.
12
+
13
+ "Licensor" shall mean the copyright owner or entity authorized by
14
+ the copyright owner that is granting the License.
15
+
16
+ "Legal Entity" shall mean the union of the acting entity and all
17
+ other entities that control, are controlled by, or are under common
18
+ control with that entity. For the purposes of this definition,
19
+ "control" means (i) the power, direct or indirect, to cause the
20
+ direction or management of such entity, whether by contract or
21
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
22
+ outstanding shares, or (iii) beneficial ownership of such entity.
23
+
24
+ "You" (or "Your") shall mean an individual or Legal Entity
25
+ exercising permissions granted by this License.
26
+
27
+ "Source" form shall mean the preferred form for making modifications,
28
+ including but not limited to software source code, documentation
29
+ source, and configuration files.
30
+
31
+ "Object" form shall mean any form resulting from mechanical
32
+ transformation or translation of a Source form, including but
33
+ not limited to compiled object code, generated documentation,
34
+ and conversions to other media types.
35
+
36
+ "Work" shall mean the work of authorship, whether in Source or
37
+ Object form, made available under the License, as indicated by a
38
+ copyright notice that is included in or attached to the work
39
+ (an example is provided in the Appendix below).
40
+
41
+ "Derivative Works" shall mean any work, whether in Source or Object
42
+ form, that is based on (or derived from) the Work and for which the
43
+ editorial revisions, annotations, elaborations, or other modifications
44
+ represent, as a whole, an original work of authorship. For the purposes
45
+ of this License, Derivative Works shall not include works that remain
46
+ separable from, or merely link (or bind by name) to the interfaces of,
47
+ the Work and Derivative Works thereof.
48
+
49
+ "Contribution" shall mean any work of authorship, including
50
+ the original version of the Work and any modifications or additions
51
+ to that Work or Derivative Works thereof, that is intentionally
52
+ submitted to Licensor for inclusion in the Work by the copyright owner
53
+ or by an individual or Legal Entity authorized to submit on behalf of
54
+ the copyright owner. For the purposes of this definition, "submitted"
55
+ means any form of electronic, verbal, or written communication sent
56
+ to the Licensor or its representatives, including but not limited to
57
+ communication on electronic mailing lists, source code control systems,
58
+ and issue tracking systems that are managed by, or on behalf of, the
59
+ Licensor for the purpose of discussing and improving the Work, but
60
+ excluding communication that is conspicuously marked or otherwise
61
+ designated in writing by the copyright owner as "Not a Contribution."
62
+
63
+ "Contributor" shall mean Licensor and any individual or Legal Entity
64
+ on behalf of whom a Contribution has been received by Licensor and
65
+ subsequently incorporated within the Work.
66
+
67
+ 2. Grant of Copyright License. Subject to the terms and conditions of
68
+ this License, each Contributor hereby grants to You a perpetual,
69
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
70
+ copyright license to reproduce, prepare Derivative Works of,
71
+ publicly display, publicly perform, sublicense, and distribute the
72
+ Work and such Derivative Works in Source or Object form.
73
+
74
+ 3. Grant of Patent License. Subject to the terms and conditions of
75
+ this License, each Contributor hereby grants to You a perpetual,
76
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
77
+ (except as stated in this section) patent license to make, have made,
78
+ use, offer to sell, sell, import, and otherwise transfer the Work,
79
+ where such license applies only to those patent claims licensable
80
+ by such Contributor that are necessarily infringed by their
81
+ Contribution(s) alone or by combination of their Contribution(s)
82
+ with the Work to which such Contribution(s) was submitted. If You
83
+ institute patent litigation against any entity (including a
84
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
85
+ or a Contribution incorporated within the Work constitutes direct
86
+ or contributory patent infringement, then any patent licenses
87
+ granted to You under this License for that Work shall terminate
88
+ as of the date such litigation is filed.
89
+
90
+ 4. Redistribution. You may reproduce and distribute copies of the
91
+ Work or Derivative Works thereof in any medium, with or without
92
+ modifications, and in Source or Object form, provided that You
93
+ meet the following conditions:
94
+
95
+ (a) You must give any other recipients of the Work or
96
+ Derivative Works a copy of this License; and
97
+
98
+ (b) You must cause any modified files to carry prominent notices
99
+ stating that You changed the files; and
100
+
101
+ (c) You must retain, in the Source form of any Derivative Works
102
+ that You distribute, all copyright, patent, trademark, and
103
+ attribution notices from the Source form of the Work,
104
+ excluding those notices that do not pertain to any part of
105
+ the Derivative Works; and
106
+
107
+ (d) If the Work includes a "NOTICE" text file as part of its
108
+ distribution, then any Derivative Works that You distribute must
109
+ include a readable copy of the attribution notices contained
110
+ within such NOTICE file, excluding those notices that do not
111
+ pertain to any part of the Derivative Works, in at least one
112
+ of the following places: within a NOTICE text file distributed
113
+ as part of the Derivative Works; within the Source form or
114
+ documentation, if provided along with the Derivative Works; or,
115
+ within a display generated by the Derivative Works, if and
116
+ wherever such third-party notices normally appear. The contents
117
+ of the NOTICE file are for informational purposes only and
118
+ do not modify the License. You may add Your own attribution
119
+ notices within Derivative Works that You distribute, alongside
120
+ or as an addendum to the NOTICE text from the Work, provided
121
+ that such additional attribution notices cannot be construed
122
+ as modifying the License.
123
+
124
+ You may add Your own copyright statement to Your modifications and
125
+ may provide additional or different license terms and conditions
126
+ for use, reproduction, or distribution of Your modifications, or
127
+ for any such Derivative Works as a whole, provided Your use,
128
+ reproduction, and distribution of the Work otherwise complies with
129
+ the conditions stated in this License.
130
+
131
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
132
+ any Contribution intentionally submitted for inclusion in the Work
133
+ by You to the Licensor shall be under the terms and conditions of
134
+ this License, without any additional terms or conditions.
135
+ Notwithstanding the above, nothing herein shall supersede or modify
136
+ the terms of any separate license agreement you may have executed
137
+ with Licensor regarding such Contributions.
138
+
139
+ 6. Trademarks. This License does not grant permission to use the trade
140
+ names, trademarks, service marks, or product names of the Licensor,
141
+ except as required for reasonable and customary use in describing the
142
+ origin of the Work and reproducing the content of the NOTICE file.
143
+
144
+ 7. Disclaimer of Warranty. Unless required by applicable law or
145
+ agreed to in writing, Licensor provides the Work (and each
146
+ Contributor provides its Contributions) on an "AS IS" BASIS,
147
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
148
+ implied, including, without limitation, any warranties or conditions
149
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
150
+ PARTICULAR PURPOSE. You are solely responsible for determining the
151
+ appropriateness of using or redistributing the Work and assume any
152
+ risks associated with Your exercise of permissions under this License.
153
+
154
+ 8. Limitation of Liability. In no event and under no legal theory,
155
+ whether in tort (including negligence), contract, or otherwise,
156
+ unless required by applicable law (such as deliberate and grossly
157
+ negligent acts) or agreed to in writing, shall any Contributor be
158
+ liable to You for damages, including any direct, indirect, special,
159
+ incidental, or consequential damages of any character arising as a
160
+ result of this License or out of the use or inability to use the
161
+ Work (including but not limited to damages for loss of goodwill,
162
+ work stoppage, computer failure or malfunction, or any and all
163
+ other commercial damages or losses), even if such Contributor
164
+ has been advised of the possibility of such damages.
165
+
166
+ 9. Accepting Warranty or Additional Liability. While redistributing
167
+ the Work or Derivative Works thereof, You may choose to offer,
168
+ and charge a fee for, acceptance of support, warranty, indemnity,
169
+ or other liability obligations and/or rights consistent with this
170
+ License. However, in accepting such obligations, You may act only
171
+ on Your own behalf and on Your sole responsibility, not on behalf
172
+ of any other Contributor, and only if You agree to indemnify,
173
+ defend, and hold each Contributor harmless for any liability
174
+ incurred by, or claims asserted against, such Contributor by reason
175
+ of your accepting any such warranty or additional liability.
176
+
177
+ END OF TERMS AND CONDITIONS
178
+
179
+ APPENDIX: How to apply the Apache License to your work.
180
+
181
+ To apply the Apache License to your work, attach the following
182
+ boilerplate notice, with the fields enclosed by brackets "[]"
183
+ replaced with your own identifying information. (Don't include
184
+ the brackets!) The text should be enclosed in the appropriate
185
+ comment syntax for the file format. We also recommend that a
186
+ file or class name and description of purpose be included on the
187
+ same "printed page" as the copyright notice for easier
188
+ identification within third-party archives.
189
+
190
+ Copyright [yyyy] [name of copyright owner]
191
+
192
+ Licensed under the Apache License, Version 2.0 (the "License");
193
+ you may not use this file except in compliance with the License.
194
+ You may obtain a copy of the License at
195
+
196
+ http://www.apache.org/licenses/LICENSE-2.0
197
+
198
+ Unless required by applicable law or agreed to in writing, software
199
+ distributed under the License is distributed on an "AS IS" BASIS,
200
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
201
+ See the License for the specific language governing permissions and
202
+ limitations under the License.
README.md ADDED
@@ -0,0 +1,38 @@
+ ---
+ base_model: google-bert/bert-large-uncased
+ license: apache-2.0
+ pipeline_tag: question-answering
+ library_name: furiosa-llm
+ tags:
+ - furiosa-ai
+ ---
+ # Model Overview
+ - **Model Architecture:** BERT
+ - **Input:** Text
+ - **Output:** Text
+ - **Model Optimizations:**
+ - **Maximum Context Length:** 384 tokens
+ - **Intended Use Cases:** Intended for commercial and non-commercial use. As with [google/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased), this model is intended for question answering.
+ - **Release Date:** 04/12/2025
+ - **Version:** v2025.2
+ - **License(s):** [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md)
+ - **Supported Inference Engine(s):** Furiosa LLM
+ - **Supported Hardware Compatibility:** FuriosaAI RNGD
+ - **Preferred Operating System(s):** Linux
+ - **Quantization:**
+ - Tool: Furiosa Model Compressor v0.6.2, included in Furiosa SDK 2025.2
+ - Weight: int8, Activation: int8, KV cache: int8
+ - Calibration: [SQuAD v1.1 dataset](https://rajpurkar.github.io/SQuAD-explorer/) ([instructions](https://zenodo.org/records/4792496)), [100 samples](https://github.com/mlcommons/inference/blob/master/calibration/SQuAD-v1.1/bert_calibration_features.txt)
+
+
+ ## Description
+ This model is a pre-compiled version of [google/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased), an encoder model that uses an optimized transformer architecture.
+
+ ## Usage
+
+ ### MLPerf Benchmark using RNGD
+ Follow the example command below after [installing furiosa-mlperf and its prerequisites](https://developer.furiosa.ai/latest/en/getting_started/furiosa_mlperf.html).
+
+ ```sh
+ furiosa-mlperf bert-offline furiosa-ai/bert-large-uncased-INT8-MLPerf ./mlperf-result
+ ```
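One way to stage the artifact locally before invoking the benchmark is to pre-download the repository with the Hugging Face Hub client. Below is a minimal sketch, assuming the `huggingface_hub` package is installed; `furiosa-mlperf` can also be pointed directly at the repository id as in the command above.

```python
# Sketch: pre-fetch the compiled artifact (EDF graphs, quantized safetensors,
# tokenizer files) so the benchmark can read it from a local directory.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="furiosa-ai/bert-large-uncased-INT8-MLPerf")
print("artifact downloaded to", local_dir)
```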
artifact.json ADDED
@@ -0,0 +1,1581 @@
1
+ {
2
+ "metadata": {
3
+ "artifact_id": "d86d900b-d7f7-4838-9727-35ca1b0d4ec4",
4
+ "name": "mlperf-bert",
5
+ "timestamp": 1745456449,
6
+ "furiosa_llm_version": "249c6f1",
7
+ "furiosa_compiler_version": "b504d5d48"
8
+ },
9
+ "model": {
10
+ "generator_config": {
11
+ "position_id_pad": 1,
12
+ "buckets": [
13
+ {
14
+ "batch_size": 1,
15
+ "attention_size": 384,
16
+ "kv_cache_size": 0
17
+ },
18
+ {
19
+ "batch_size": 1,
20
+ "attention_size": 320,
21
+ "kv_cache_size": 0
22
+ },
23
+ {
24
+ "batch_size": 1,
25
+ "attention_size": 192,
26
+ "kv_cache_size": 0
27
+ },
28
+ {
29
+ "batch_size": 1,
30
+ "attention_size": 128,
31
+ "kv_cache_size": 0
32
+ },
33
+ {
34
+ "batch_size": 1,
35
+ "attention_size": 160,
36
+ "kv_cache_size": 0
37
+ },
38
+ {
39
+ "batch_size": 2,
40
+ "attention_size": 96,
41
+ "kv_cache_size": 0
42
+ }
43
+ ],
44
+ "model_qname": "furiosa_llm_models.bert.symbolic.mlperf_submission.BertForQuestionAnswering",
45
+ "paged_attention_config": null,
46
+ "packing_type": "IDENTITY",
47
+ "kv_cache_sharing_across_beams_config": null,
48
+ "num_speculative_tokens": null,
49
+ "unpadded_vocab_size": null
50
+ },
51
+ "hf_config": {
52
+ "return_dict": true,
53
+ "output_hidden_states": false,
54
+ "output_attentions": false,
55
+ "torchscript": false,
56
+ "torch_dtype": "float32",
57
+ "use_bfloat16": false,
58
+ "tf_legacy_loss": false,
59
+ "pruned_heads": {},
60
+ "tie_word_embeddings": true,
61
+ "chunk_size_feed_forward": 0,
62
+ "is_encoder_decoder": false,
63
+ "is_decoder": false,
64
+ "cross_attention_hidden_size": null,
65
+ "add_cross_attention": false,
66
+ "tie_encoder_decoder": false,
67
+ "max_length": 20,
68
+ "min_length": 0,
69
+ "do_sample": false,
70
+ "early_stopping": false,
71
+ "num_beams": 1,
72
+ "num_beam_groups": 1,
73
+ "diversity_penalty": 0.0,
74
+ "temperature": 1.0,
75
+ "top_k": 50,
76
+ "top_p": 1.0,
77
+ "typical_p": 1.0,
78
+ "repetition_penalty": 1.0,
79
+ "length_penalty": 1.0,
80
+ "no_repeat_ngram_size": 0,
81
+ "encoder_no_repeat_ngram_size": 0,
82
+ "bad_words_ids": null,
83
+ "num_return_sequences": 1,
84
+ "output_scores": false,
85
+ "return_dict_in_generate": false,
86
+ "forced_bos_token_id": null,
87
+ "forced_eos_token_id": null,
88
+ "remove_invalid_values": false,
89
+ "exponential_decay_length_penalty": null,
90
+ "suppress_tokens": null,
91
+ "begin_suppress_tokens": null,
92
+ "architectures": [
93
+ "BertForQuestionAnswering"
94
+ ],
95
+ "finetuning_task": null,
96
+ "id2label": {
97
+ "0": "LABEL_0",
98
+ "1": "LABEL_1"
99
+ },
100
+ "label2id": {
101
+ "LABEL_0": 0,
102
+ "LABEL_1": 1
103
+ },
104
+ "tokenizer_class": null,
105
+ "prefix": null,
106
+ "bos_token_id": null,
107
+ "pad_token_id": 0,
108
+ "eos_token_id": null,
109
+ "sep_token_id": null,
110
+ "decoder_start_token_id": null,
111
+ "task_specific_params": null,
112
+ "problem_type": null,
113
+ "_name_or_path": "furiosa-ai/mlperf-bert-large",
114
+ "_attn_implementation_autoset": false,
115
+ "transformers_version": "4.48.1",
116
+ "model_type": "bert",
117
+ "vocab_size": 30522,
118
+ "hidden_size": 1024,
119
+ "num_hidden_layers": 24,
120
+ "num_attention_heads": 16,
121
+ "hidden_act": "rngd_gelu",
122
+ "intermediate_size": 4096,
123
+ "hidden_dropout_prob": 0.1,
124
+ "attention_probs_dropout_prob": 0.1,
125
+ "max_position_embeddings": 512,
126
+ "type_vocab_size": 2,
127
+ "initializer_range": 0.02,
128
+ "layer_norm_eps": 1e-12,
129
+ "position_embedding_type": "absolute",
130
+ "use_cache": true,
131
+ "classifier_dropout": null
132
+ },
133
+ "model_metadata": {
134
+ "pretrained_id": "furiosa-ai/mlperf-bert-large",
135
+ "task_type": "question-answering",
136
+ "llm_config": {
137
+ "optimization_config": {
138
+ "attention_type": "VANILLA",
139
+ "optimize_rope": false,
140
+ "optimize_packed": false,
141
+ "decompose_layernorm": false,
142
+ "optimize_furiosa": false,
143
+ "use_unsplit_packed": true,
144
+ "compact_causal_mask": false,
145
+ "use_rngd_gelu": true,
146
+ "causal_mask_free_decoding": false,
147
+ "kv_cache_sharing_across_beams": false,
148
+ "inbound_beamsearch_softmax": false,
149
+ "calculate_logit_only_for_last_token": false,
150
+ "optimized_for_speculative_decoding": false
151
+ },
152
+ "quantization_config": {
153
+ "weight": "int8",
154
+ "activation": "int8",
155
+ "kv_cache": "int8",
156
+ "use_mcp": true
157
+ }
158
+ },
159
+ "hf_configs": {},
160
+ "model_weight_path": null,
161
+ "trust_remote_code": null,
162
+ "allow_bfloat16_cast_with_mcp": true,
163
+ "auto_bfloat16_cast": null
164
+ },
165
+ "model_rewriting_config": {
166
+ "do_decompositions_for_model_rewrite": false,
167
+ "use_blockwise_compile": true,
168
+ "embedding_layer_as_single_block": false,
169
+ "num_blocks_per_supertask": 24,
170
+ "embed_all_constants_into_graph": true,
171
+ "optimize_logit_shape": true
172
+ },
173
+ "parallel_config": {
174
+ "tensor_parallel_size": 1,
175
+ "pipeline_parallel_size": 1
176
+ },
177
+ "pipelines": [
178
+ {
179
+ "name": "Quantized_furiosa_llm_models.bert.symbolic.mlperf_submission.BertForQuestionAnswering-kv0-b1-attn384",
180
+ "devices": {
181
+ "0": "npu:0:0"
182
+ },
183
+ "tensors": {
184
+ "d0_arg0_1": {
185
+ "shape": [
186
+ 1,
187
+ 384
188
+ ],
189
+ "dtype": "i32"
190
+ },
191
+ "d0_arg1_1": {
192
+ "shape": [
193
+ 1,
194
+ 384
195
+ ],
196
+ "dtype": "i32"
197
+ },
198
+ "d0_arg2_1": {
199
+ "shape": [
200
+ 1,
201
+ 384,
202
+ 384
203
+ ],
204
+ "dtype": "bool"
205
+ },
206
+ "d0_arg3_1": {
207
+ "shape": [
208
+ 1,
209
+ 384
210
+ ],
211
+ "dtype": "i32"
212
+ },
213
+ "submod_d0_c0": {
214
+ "shape": [
215
+ 1,
216
+ 384,
217
+ 2
218
+ ],
219
+ "dtype": "f32"
220
+ }
221
+ },
222
+ "supertasks": {
223
+ "0": {
224
+ "kind": "input",
225
+ "inputs": [],
226
+ "outputs": [
227
+ "d0_arg0_1",
228
+ "d0_arg1_1",
229
+ "d0_arg2_1",
230
+ "d0_arg3_1"
231
+ ]
232
+ },
233
+ "1": {
234
+ "kind": "output",
235
+ "inputs": [
236
+ "submod_d0_c0"
237
+ ],
238
+ "outputs": []
239
+ },
240
+ "2": {
241
+ "kind": "edf",
242
+ "inputs": [
243
+ "d0_arg2_1",
244
+ "d0_arg0_1",
245
+ "d0_arg1_1",
246
+ "d0_arg3_1"
247
+ ],
248
+ "outputs": [
249
+ "submod_d0_c0"
250
+ ],
251
+ "device": "0",
252
+ "data": null,
253
+ "data_blob": "92713480ca8937ba5a8dadead5278d92"
254
+ }
255
+ },
256
+ "metadata": {
257
+ "tensors": {
258
+ "inputs": {
259
+ "input_ids": {
260
+ "shape": [
261
+ 1,
262
+ 384
263
+ ],
264
+ "dtype": "i32",
265
+ "idx": 0
266
+ },
267
+ "token_type_ids": {
268
+ "shape": [
269
+ 1,
270
+ 384
271
+ ],
272
+ "dtype": "i32",
273
+ "idx": 1
274
+ },
275
+ "attention_mask": {
276
+ "shape": [
277
+ 1,
278
+ 384,
279
+ 384
280
+ ],
281
+ "dtype": "bool",
282
+ "idx": 2
283
+ },
284
+ "position_ids": {
285
+ "shape": [
286
+ 1,
287
+ 384
288
+ ],
289
+ "dtype": "i32",
290
+ "idx": 3
291
+ }
292
+ },
293
+ "outputs": {
294
+ "logits": {
295
+ "shape": [
296
+ 1,
297
+ 384,
298
+ 2
299
+ ],
300
+ "dtype": "f32",
301
+ "idx": 0
302
+ }
303
+ }
304
+ },
305
+ "tensor_slices": {
306
+ "inputs": {
307
+ "d0_arg0_1": {
308
+ "placements": [
309
+ [
310
+ 0,
311
+ 1
312
+ ],
313
+ [
314
+ 0,
315
+ 384
316
+ ]
317
+ ],
318
+ "origin": "input_ids",
319
+ "dtype": "i32",
320
+ "device": "0"
321
+ },
322
+ "d0_arg1_1": {
323
+ "placements": [
324
+ [
325
+ 0,
326
+ 1
327
+ ],
328
+ [
329
+ 0,
330
+ 384
331
+ ]
332
+ ],
333
+ "origin": "token_type_ids",
334
+ "dtype": "i32",
335
+ "device": "0"
336
+ },
337
+ "d0_arg2_1": {
338
+ "placements": [
339
+ [
340
+ 0,
341
+ 1
342
+ ],
343
+ [
344
+ 0,
345
+ 384
346
+ ],
347
+ [
348
+ 0,
349
+ 384
350
+ ]
351
+ ],
352
+ "origin": "attention_mask",
353
+ "dtype": "bool",
354
+ "device": "0"
355
+ },
356
+ "d0_arg3_1": {
357
+ "placements": [
358
+ [
359
+ 0,
360
+ 1
361
+ ],
362
+ [
363
+ 0,
364
+ 384
365
+ ]
366
+ ],
367
+ "origin": "position_ids",
368
+ "dtype": "i32",
369
+ "device": "0"
370
+ }
371
+ },
372
+ "outputs": {
373
+ "submod_d0_c0": {
374
+ "placements": [
375
+ [
376
+ 0,
377
+ 1
378
+ ],
379
+ [
380
+ 0,
381
+ 384
382
+ ],
383
+ [
384
+ 0,
385
+ 2
386
+ ]
387
+ ],
388
+ "origin": "logits",
389
+ "dtype": "f32",
390
+ "device": "0"
391
+ }
392
+ }
393
+ }
394
+ },
395
+ "blobs": {
396
+ "92713480ca8937ba5a8dadead5278d92": null
397
+ },
398
+ "param_files": {
399
+ "0": {
400
+ "path": "params-mlperf-bert-large-mlperf_submission-24L-W8A8KV8-allow_bfloat16_cast_with_mcp-ba480aa7f239d5bf87fdd9b369ce396c7f516f5fcecf3f40000671d6299f6f5c.safetensors",
401
+ "format": "safetensors"
402
+ }
403
+ },
404
+ "device_constraints": [],
405
+ "version": "0.1.0"
406
+ },
407
+ {
408
+ "name": "Quantized_furiosa_llm_models.bert.symbolic.mlperf_submission.BertForQuestionAnswering-kv0-b1-attn320",
409
+ "devices": {
410
+ "0": "npu:0:0"
411
+ },
412
+ "tensors": {
413
+ "d0_arg0_1": {
414
+ "shape": [
415
+ 1,
416
+ 320
417
+ ],
418
+ "dtype": "i32"
419
+ },
420
+ "d0_arg1_1": {
421
+ "shape": [
422
+ 1,
423
+ 320
424
+ ],
425
+ "dtype": "i32"
426
+ },
427
+ "d0_arg2_1": {
428
+ "shape": [
429
+ 1,
430
+ 320,
431
+ 320
432
+ ],
433
+ "dtype": "bool"
434
+ },
435
+ "d0_arg3_1": {
436
+ "shape": [
437
+ 1,
438
+ 320
439
+ ],
440
+ "dtype": "i32"
441
+ },
442
+ "submod_d0_c0": {
443
+ "shape": [
444
+ 1,
445
+ 320,
446
+ 2
447
+ ],
448
+ "dtype": "f32"
449
+ }
450
+ },
451
+ "supertasks": {
452
+ "0": {
453
+ "kind": "input",
454
+ "inputs": [],
455
+ "outputs": [
456
+ "d0_arg0_1",
457
+ "d0_arg1_1",
458
+ "d0_arg2_1",
459
+ "d0_arg3_1"
460
+ ]
461
+ },
462
+ "1": {
463
+ "kind": "output",
464
+ "inputs": [
465
+ "submod_d0_c0"
466
+ ],
467
+ "outputs": []
468
+ },
469
+ "2": {
470
+ "kind": "edf",
471
+ "inputs": [
472
+ "d0_arg2_1",
473
+ "d0_arg0_1",
474
+ "d0_arg1_1",
475
+ "d0_arg3_1"
476
+ ],
477
+ "outputs": [
478
+ "submod_d0_c0"
479
+ ],
480
+ "device": "0",
481
+ "data": null,
482
+ "data_blob": "0ff335c7ce60753ee28a910e9fab16f4"
483
+ }
484
+ },
485
+ "metadata": {
486
+ "tensors": {
487
+ "inputs": {
488
+ "input_ids": {
489
+ "shape": [
490
+ 1,
491
+ 320
492
+ ],
493
+ "dtype": "i32",
494
+ "idx": 0
495
+ },
496
+ "token_type_ids": {
497
+ "shape": [
498
+ 1,
499
+ 320
500
+ ],
501
+ "dtype": "i32",
502
+ "idx": 1
503
+ },
504
+ "attention_mask": {
505
+ "shape": [
506
+ 1,
507
+ 320,
508
+ 320
509
+ ],
510
+ "dtype": "bool",
511
+ "idx": 2
512
+ },
513
+ "position_ids": {
514
+ "shape": [
515
+ 1,
516
+ 320
517
+ ],
518
+ "dtype": "i32",
519
+ "idx": 3
520
+ }
521
+ },
522
+ "outputs": {
523
+ "logits": {
524
+ "shape": [
525
+ 1,
526
+ 320,
527
+ 2
528
+ ],
529
+ "dtype": "f32",
530
+ "idx": 0
531
+ }
532
+ }
533
+ },
534
+ "tensor_slices": {
535
+ "inputs": {
536
+ "d0_arg0_1": {
537
+ "placements": [
538
+ [
539
+ 0,
540
+ 1
541
+ ],
542
+ [
543
+ 0,
544
+ 320
545
+ ]
546
+ ],
547
+ "origin": "input_ids",
548
+ "dtype": "i32",
549
+ "device": "0"
550
+ },
551
+ "d0_arg1_1": {
552
+ "placements": [
553
+ [
554
+ 0,
555
+ 1
556
+ ],
557
+ [
558
+ 0,
559
+ 320
560
+ ]
561
+ ],
562
+ "origin": "token_type_ids",
563
+ "dtype": "i32",
564
+ "device": "0"
565
+ },
566
+ "d0_arg2_1": {
567
+ "placements": [
568
+ [
569
+ 0,
570
+ 1
571
+ ],
572
+ [
573
+ 0,
574
+ 320
575
+ ],
576
+ [
577
+ 0,
578
+ 320
579
+ ]
580
+ ],
581
+ "origin": "attention_mask",
582
+ "dtype": "bool",
583
+ "device": "0"
584
+ },
585
+ "d0_arg3_1": {
586
+ "placements": [
587
+ [
588
+ 0,
589
+ 1
590
+ ],
591
+ [
592
+ 0,
593
+ 320
594
+ ]
595
+ ],
596
+ "origin": "position_ids",
597
+ "dtype": "i32",
598
+ "device": "0"
599
+ }
600
+ },
601
+ "outputs": {
602
+ "submod_d0_c0": {
603
+ "placements": [
604
+ [
605
+ 0,
606
+ 1
607
+ ],
608
+ [
609
+ 0,
610
+ 320
611
+ ],
612
+ [
613
+ 0,
614
+ 2
615
+ ]
616
+ ],
617
+ "origin": "logits",
618
+ "dtype": "f32",
619
+ "device": "0"
620
+ }
621
+ }
622
+ }
623
+ },
624
+ "blobs": {
625
+ "0ff335c7ce60753ee28a910e9fab16f4": null
626
+ },
627
+ "param_files": {
628
+ "0": {
629
+ "path": "params-mlperf-bert-large-mlperf_submission-24L-W8A8KV8-allow_bfloat16_cast_with_mcp-ba480aa7f239d5bf87fdd9b369ce396c7f516f5fcecf3f40000671d6299f6f5c.safetensors",
630
+ "format": "safetensors"
631
+ }
632
+ },
633
+ "device_constraints": [],
634
+ "version": "0.1.0"
635
+ },
636
+ {
637
+ "name": "Quantized_furiosa_llm_models.bert.symbolic.mlperf_submission.BertForQuestionAnswering-kv0-b1-attn192",
638
+ "devices": {
639
+ "0": "npu:0:0"
640
+ },
641
+ "tensors": {
642
+ "d0_arg0_1": {
643
+ "shape": [
644
+ 1,
645
+ 192
646
+ ],
647
+ "dtype": "i32"
648
+ },
649
+ "d0_arg1_1": {
650
+ "shape": [
651
+ 1,
652
+ 192
653
+ ],
654
+ "dtype": "i32"
655
+ },
656
+ "d0_arg2_1": {
657
+ "shape": [
658
+ 1,
659
+ 192,
660
+ 192
661
+ ],
662
+ "dtype": "bool"
663
+ },
664
+ "d0_arg3_1": {
665
+ "shape": [
666
+ 1,
667
+ 192
668
+ ],
669
+ "dtype": "i32"
670
+ },
671
+ "submod_d0_c0": {
672
+ "shape": [
673
+ 1,
674
+ 192,
675
+ 2
676
+ ],
677
+ "dtype": "f32"
678
+ }
679
+ },
680
+ "supertasks": {
681
+ "0": {
682
+ "kind": "input",
683
+ "inputs": [],
684
+ "outputs": [
685
+ "d0_arg0_1",
686
+ "d0_arg1_1",
687
+ "d0_arg2_1",
688
+ "d0_arg3_1"
689
+ ]
690
+ },
691
+ "1": {
692
+ "kind": "output",
693
+ "inputs": [
694
+ "submod_d0_c0"
695
+ ],
696
+ "outputs": []
697
+ },
698
+ "2": {
699
+ "kind": "edf",
700
+ "inputs": [
701
+ "d0_arg2_1",
702
+ "d0_arg0_1",
703
+ "d0_arg1_1",
704
+ "d0_arg3_1"
705
+ ],
706
+ "outputs": [
707
+ "submod_d0_c0"
708
+ ],
709
+ "device": "0",
710
+ "data": null,
711
+ "data_blob": "eb1a559cd1f53e2ede74f1307030a1d0"
712
+ }
713
+ },
714
+ "metadata": {
715
+ "tensors": {
716
+ "inputs": {
717
+ "input_ids": {
718
+ "shape": [
719
+ 1,
720
+ 192
721
+ ],
722
+ "dtype": "i32",
723
+ "idx": 0
724
+ },
725
+ "token_type_ids": {
726
+ "shape": [
727
+ 1,
728
+ 192
729
+ ],
730
+ "dtype": "i32",
731
+ "idx": 1
732
+ },
733
+ "attention_mask": {
734
+ "shape": [
735
+ 1,
736
+ 192,
737
+ 192
738
+ ],
739
+ "dtype": "bool",
740
+ "idx": 2
741
+ },
742
+ "position_ids": {
743
+ "shape": [
744
+ 1,
745
+ 192
746
+ ],
747
+ "dtype": "i32",
748
+ "idx": 3
749
+ }
750
+ },
751
+ "outputs": {
752
+ "logits": {
753
+ "shape": [
754
+ 1,
755
+ 192,
756
+ 2
757
+ ],
758
+ "dtype": "f32",
759
+ "idx": 0
760
+ }
761
+ }
762
+ },
763
+ "tensor_slices": {
764
+ "inputs": {
765
+ "d0_arg0_1": {
766
+ "placements": [
767
+ [
768
+ 0,
769
+ 1
770
+ ],
771
+ [
772
+ 0,
773
+ 192
774
+ ]
775
+ ],
776
+ "origin": "input_ids",
777
+ "dtype": "i32",
778
+ "device": "0"
779
+ },
780
+ "d0_arg1_1": {
781
+ "placements": [
782
+ [
783
+ 0,
784
+ 1
785
+ ],
786
+ [
787
+ 0,
788
+ 192
789
+ ]
790
+ ],
791
+ "origin": "token_type_ids",
792
+ "dtype": "i32",
793
+ "device": "0"
794
+ },
795
+ "d0_arg2_1": {
796
+ "placements": [
797
+ [
798
+ 0,
799
+ 1
800
+ ],
801
+ [
802
+ 0,
803
+ 192
804
+ ],
805
+ [
806
+ 0,
807
+ 192
808
+ ]
809
+ ],
810
+ "origin": "attention_mask",
811
+ "dtype": "bool",
812
+ "device": "0"
813
+ },
814
+ "d0_arg3_1": {
815
+ "placements": [
816
+ [
817
+ 0,
818
+ 1
819
+ ],
820
+ [
821
+ 0,
822
+ 192
823
+ ]
824
+ ],
825
+ "origin": "position_ids",
826
+ "dtype": "i32",
827
+ "device": "0"
828
+ }
829
+ },
830
+ "outputs": {
831
+ "submod_d0_c0": {
832
+ "placements": [
833
+ [
834
+ 0,
835
+ 1
836
+ ],
837
+ [
838
+ 0,
839
+ 192
840
+ ],
841
+ [
842
+ 0,
843
+ 2
844
+ ]
845
+ ],
846
+ "origin": "logits",
847
+ "dtype": "f32",
848
+ "device": "0"
849
+ }
850
+ }
851
+ }
852
+ },
853
+ "blobs": {
854
+ "eb1a559cd1f53e2ede74f1307030a1d0": null
855
+ },
856
+ "param_files": {
857
+ "0": {
858
+ "path": "params-mlperf-bert-large-mlperf_submission-24L-W8A8KV8-allow_bfloat16_cast_with_mcp-ba480aa7f239d5bf87fdd9b369ce396c7f516f5fcecf3f40000671d6299f6f5c.safetensors",
859
+ "format": "safetensors"
860
+ }
861
+ },
862
+ "device_constraints": [],
863
+ "version": "0.1.0"
864
+ },
865
+ {
866
+ "name": "Quantized_furiosa_llm_models.bert.symbolic.mlperf_submission.BertForQuestionAnswering-kv0-b1-attn128",
867
+ "devices": {
868
+ "0": "npu:0:0"
869
+ },
870
+ "tensors": {
871
+ "d0_arg0_1": {
872
+ "shape": [
873
+ 1,
874
+ 128
875
+ ],
876
+ "dtype": "i32"
877
+ },
878
+ "d0_arg1_1": {
879
+ "shape": [
880
+ 1,
881
+ 128
882
+ ],
883
+ "dtype": "i32"
884
+ },
885
+ "d0_arg2_1": {
886
+ "shape": [
887
+ 1,
888
+ 128,
889
+ 128
890
+ ],
891
+ "dtype": "bool"
892
+ },
893
+ "d0_arg3_1": {
894
+ "shape": [
895
+ 1,
896
+ 128
897
+ ],
898
+ "dtype": "i32"
899
+ },
900
+ "submod_d0_c0": {
901
+ "shape": [
902
+ 1,
903
+ 128,
904
+ 2
905
+ ],
906
+ "dtype": "f32"
907
+ }
908
+ },
909
+ "supertasks": {
910
+ "0": {
911
+ "kind": "input",
912
+ "inputs": [],
913
+ "outputs": [
914
+ "d0_arg0_1",
915
+ "d0_arg1_1",
916
+ "d0_arg2_1",
917
+ "d0_arg3_1"
918
+ ]
919
+ },
920
+ "1": {
921
+ "kind": "output",
922
+ "inputs": [
923
+ "submod_d0_c0"
924
+ ],
925
+ "outputs": []
926
+ },
927
+ "2": {
928
+ "kind": "edf",
929
+ "inputs": [
930
+ "d0_arg2_1",
931
+ "d0_arg0_1",
932
+ "d0_arg1_1",
933
+ "d0_arg3_1"
934
+ ],
935
+ "outputs": [
936
+ "submod_d0_c0"
937
+ ],
938
+ "device": "0",
939
+ "data": null,
940
+ "data_blob": "9ad47915b97d47d3ce069c00271807d6"
941
+ }
942
+ },
943
+ "metadata": {
944
+ "tensors": {
945
+ "inputs": {
946
+ "input_ids": {
947
+ "shape": [
948
+ 1,
949
+ 128
950
+ ],
951
+ "dtype": "i32",
952
+ "idx": 0
953
+ },
954
+ "token_type_ids": {
955
+ "shape": [
956
+ 1,
957
+ 128
958
+ ],
959
+ "dtype": "i32",
960
+ "idx": 1
961
+ },
962
+ "attention_mask": {
963
+ "shape": [
964
+ 1,
965
+ 128,
966
+ 128
967
+ ],
968
+ "dtype": "bool",
969
+ "idx": 2
970
+ },
971
+ "position_ids": {
972
+ "shape": [
973
+ 1,
974
+ 128
975
+ ],
976
+ "dtype": "i32",
977
+ "idx": 3
978
+ }
979
+ },
980
+ "outputs": {
981
+ "logits": {
982
+ "shape": [
983
+ 1,
984
+ 128,
985
+ 2
986
+ ],
987
+ "dtype": "f32",
988
+ "idx": 0
989
+ }
990
+ }
991
+ },
992
+ "tensor_slices": {
993
+ "inputs": {
994
+ "d0_arg0_1": {
995
+ "placements": [
996
+ [
997
+ 0,
998
+ 1
999
+ ],
1000
+ [
1001
+ 0,
1002
+ 128
1003
+ ]
1004
+ ],
1005
+ "origin": "input_ids",
1006
+ "dtype": "i32",
1007
+ "device": "0"
1008
+ },
1009
+ "d0_arg1_1": {
1010
+ "placements": [
1011
+ [
1012
+ 0,
1013
+ 1
1014
+ ],
1015
+ [
1016
+ 0,
1017
+ 128
1018
+ ]
1019
+ ],
1020
+ "origin": "token_type_ids",
1021
+ "dtype": "i32",
1022
+ "device": "0"
1023
+ },
1024
+ "d0_arg2_1": {
1025
+ "placements": [
1026
+ [
1027
+ 0,
1028
+ 1
1029
+ ],
1030
+ [
1031
+ 0,
1032
+ 128
1033
+ ],
1034
+ [
1035
+ 0,
1036
+ 128
1037
+ ]
1038
+ ],
1039
+ "origin": "attention_mask",
1040
+ "dtype": "bool",
1041
+ "device": "0"
1042
+ },
1043
+ "d0_arg3_1": {
1044
+ "placements": [
1045
+ [
1046
+ 0,
1047
+ 1
1048
+ ],
1049
+ [
1050
+ 0,
1051
+ 128
1052
+ ]
1053
+ ],
1054
+ "origin": "position_ids",
1055
+ "dtype": "i32",
1056
+ "device": "0"
1057
+ }
1058
+ },
1059
+ "outputs": {
1060
+ "submod_d0_c0": {
1061
+ "placements": [
1062
+ [
1063
+ 0,
1064
+ 1
1065
+ ],
1066
+ [
1067
+ 0,
1068
+ 128
1069
+ ],
1070
+ [
1071
+ 0,
1072
+ 2
1073
+ ]
1074
+ ],
1075
+ "origin": "logits",
1076
+ "dtype": "f32",
1077
+ "device": "0"
1078
+ }
1079
+ }
1080
+ }
1081
+ },
1082
+ "blobs": {
1083
+ "9ad47915b97d47d3ce069c00271807d6": null
1084
+ },
1085
+ "param_files": {
1086
+ "0": {
1087
+ "path": "params-mlperf-bert-large-mlperf_submission-24L-W8A8KV8-allow_bfloat16_cast_with_mcp-ba480aa7f239d5bf87fdd9b369ce396c7f516f5fcecf3f40000671d6299f6f5c.safetensors",
1088
+ "format": "safetensors"
1089
+ }
1090
+ },
1091
+ "device_constraints": [],
1092
+ "version": "0.1.0"
1093
+ },
1094
+ {
1095
+ "name": "Quantized_furiosa_llm_models.bert.symbolic.mlperf_submission.BertForQuestionAnswering-kv0-b1-attn160",
1096
+ "devices": {
1097
+ "0": "npu:0:0"
1098
+ },
1099
+ "tensors": {
1100
+ "d0_arg0_1": {
1101
+ "shape": [
1102
+ 1,
1103
+ 160
1104
+ ],
1105
+ "dtype": "i32"
1106
+ },
1107
+ "d0_arg1_1": {
1108
+ "shape": [
1109
+ 1,
1110
+ 160
1111
+ ],
1112
+ "dtype": "i32"
1113
+ },
1114
+ "d0_arg2_1": {
1115
+ "shape": [
1116
+ 1,
1117
+ 160,
1118
+ 160
1119
+ ],
1120
+ "dtype": "bool"
1121
+ },
1122
+ "d0_arg3_1": {
1123
+ "shape": [
1124
+ 1,
1125
+ 160
1126
+ ],
1127
+ "dtype": "i32"
1128
+ },
1129
+ "submod_d0_c0": {
1130
+ "shape": [
1131
+ 1,
1132
+ 160,
1133
+ 2
1134
+ ],
1135
+ "dtype": "f32"
1136
+ }
1137
+ },
1138
+ "supertasks": {
1139
+ "0": {
1140
+ "kind": "input",
1141
+ "inputs": [],
1142
+ "outputs": [
1143
+ "d0_arg0_1",
1144
+ "d0_arg1_1",
1145
+ "d0_arg2_1",
1146
+ "d0_arg3_1"
1147
+ ]
1148
+ },
1149
+ "1": {
1150
+ "kind": "output",
1151
+ "inputs": [
1152
+ "submod_d0_c0"
1153
+ ],
1154
+ "outputs": []
1155
+ },
1156
+ "2": {
1157
+ "kind": "edf",
1158
+ "inputs": [
1159
+ "d0_arg2_1",
1160
+ "d0_arg0_1",
1161
+ "d0_arg1_1",
1162
+ "d0_arg3_1"
1163
+ ],
1164
+ "outputs": [
1165
+ "submod_d0_c0"
1166
+ ],
1167
+ "device": "0",
1168
+ "data": null,
1169
+ "data_blob": "8a7b90c915c1cecaf381c70594e3f955"
1170
+ }
1171
+ },
1172
+ "metadata": {
1173
+ "tensors": {
1174
+ "inputs": {
1175
+ "input_ids": {
1176
+ "shape": [
1177
+ 1,
1178
+ 160
1179
+ ],
1180
+ "dtype": "i32",
1181
+ "idx": 0
1182
+ },
1183
+ "token_type_ids": {
1184
+ "shape": [
1185
+ 1,
1186
+ 160
1187
+ ],
1188
+ "dtype": "i32",
1189
+ "idx": 1
1190
+ },
1191
+ "attention_mask": {
1192
+ "shape": [
1193
+ 1,
1194
+ 160,
1195
+ 160
1196
+ ],
1197
+ "dtype": "bool",
1198
+ "idx": 2
1199
+ },
1200
+ "position_ids": {
1201
+ "shape": [
1202
+ 1,
1203
+ 160
1204
+ ],
1205
+ "dtype": "i32",
1206
+ "idx": 3
1207
+ }
1208
+ },
1209
+ "outputs": {
1210
+ "logits": {
1211
+ "shape": [
1212
+ 1,
1213
+ 160,
1214
+ 2
1215
+ ],
1216
+ "dtype": "f32",
1217
+ "idx": 0
1218
+ }
1219
+ }
1220
+ },
1221
+ "tensor_slices": {
1222
+ "inputs": {
1223
+ "d0_arg0_1": {
1224
+ "placements": [
1225
+ [
1226
+ 0,
1227
+ 1
1228
+ ],
1229
+ [
1230
+ 0,
1231
+ 160
1232
+ ]
1233
+ ],
1234
+ "origin": "input_ids",
1235
+ "dtype": "i32",
1236
+ "device": "0"
1237
+ },
1238
+ "d0_arg1_1": {
1239
+ "placements": [
1240
+ [
1241
+ 0,
1242
+ 1
1243
+ ],
1244
+ [
1245
+ 0,
1246
+ 160
1247
+ ]
1248
+ ],
1249
+ "origin": "token_type_ids",
1250
+ "dtype": "i32",
1251
+ "device": "0"
1252
+ },
1253
+ "d0_arg2_1": {
1254
+ "placements": [
1255
+ [
1256
+ 0,
1257
+ 1
1258
+ ],
1259
+ [
1260
+ 0,
1261
+ 160
1262
+ ],
1263
+ [
1264
+ 0,
1265
+ 160
1266
+ ]
1267
+ ],
1268
+ "origin": "attention_mask",
1269
+ "dtype": "bool",
1270
+ "device": "0"
1271
+ },
1272
+ "d0_arg3_1": {
1273
+ "placements": [
1274
+ [
1275
+ 0,
1276
+ 1
1277
+ ],
1278
+ [
1279
+ 0,
1280
+ 160
1281
+ ]
1282
+ ],
1283
+ "origin": "position_ids",
1284
+ "dtype": "i32",
1285
+ "device": "0"
1286
+ }
1287
+ },
1288
+ "outputs": {
1289
+ "submod_d0_c0": {
1290
+ "placements": [
1291
+ [
1292
+ 0,
1293
+ 1
1294
+ ],
1295
+ [
1296
+ 0,
1297
+ 160
1298
+ ],
1299
+ [
1300
+ 0,
1301
+ 2
1302
+ ]
1303
+ ],
1304
+ "origin": "logits",
1305
+ "dtype": "f32",
1306
+ "device": "0"
1307
+ }
1308
+ }
1309
+ }
1310
+ },
1311
+ "blobs": {
1312
+ "8a7b90c915c1cecaf381c70594e3f955": null
1313
+ },
1314
+ "param_files": {
1315
+ "0": {
1316
+ "path": "params-mlperf-bert-large-mlperf_submission-24L-W8A8KV8-allow_bfloat16_cast_with_mcp-ba480aa7f239d5bf87fdd9b369ce396c7f516f5fcecf3f40000671d6299f6f5c.safetensors",
1317
+ "format": "safetensors"
1318
+ }
1319
+ },
1320
+ "device_constraints": [],
1321
+ "version": "0.1.0"
1322
+ },
1323
+ {
1324
+ "name": "Quantized_furiosa_llm_models.bert.symbolic.mlperf_submission.BertForQuestionAnswering-kv0-b2-attn96",
1325
+ "devices": {
1326
+ "0": "npu:0:0"
1327
+ },
1328
+ "tensors": {
1329
+ "d0_arg0_1": {
1330
+ "shape": [
1331
+ 2,
1332
+ 96
1333
+ ],
1334
+ "dtype": "i32"
1335
+ },
1336
+ "d0_arg1_1": {
1337
+ "shape": [
1338
+ 2,
1339
+ 96
1340
+ ],
1341
+ "dtype": "i32"
1342
+ },
1343
+ "d0_arg2_1": {
1344
+ "shape": [
1345
+ 2,
1346
+ 96,
1347
+ 96
1348
+ ],
1349
+ "dtype": "bool"
1350
+ },
1351
+ "d0_arg3_1": {
1352
+ "shape": [
1353
+ 2,
1354
+ 96
1355
+ ],
1356
+ "dtype": "i32"
1357
+ },
1358
+ "submod_d0_c0": {
1359
+ "shape": [
1360
+ 2,
1361
+ 96,
1362
+ 2
1363
+ ],
1364
+ "dtype": "f32"
1365
+ }
1366
+ },
1367
+ "supertasks": {
1368
+ "0": {
1369
+ "kind": "input",
1370
+ "inputs": [],
1371
+ "outputs": [
1372
+ "d0_arg0_1",
1373
+ "d0_arg1_1",
1374
+ "d0_arg2_1",
1375
+ "d0_arg3_1"
1376
+ ]
1377
+ },
1378
+ "1": {
1379
+ "kind": "output",
1380
+ "inputs": [
1381
+ "submod_d0_c0"
1382
+ ],
1383
+ "outputs": []
1384
+ },
1385
+ "2": {
1386
+ "kind": "edf",
1387
+ "inputs": [
1388
+ "d0_arg2_1",
1389
+ "d0_arg0_1",
1390
+ "d0_arg1_1",
1391
+ "d0_arg3_1"
1392
+ ],
1393
+ "outputs": [
1394
+ "submod_d0_c0"
1395
+ ],
1396
+ "device": "0",
1397
+ "data": null,
1398
+ "data_blob": "97bb3cab5f2f7f5f4640c04cbf3b6ee0"
1399
+ }
1400
+ },
1401
+ "metadata": {
1402
+ "tensors": {
1403
+ "inputs": {
1404
+ "input_ids": {
1405
+ "shape": [
1406
+ 2,
1407
+ 96
1408
+ ],
1409
+ "dtype": "i32",
1410
+ "idx": 0
1411
+ },
1412
+ "token_type_ids": {
1413
+ "shape": [
1414
+ 2,
1415
+ 96
1416
+ ],
1417
+ "dtype": "i32",
1418
+ "idx": 1
1419
+ },
1420
+ "attention_mask": {
1421
+ "shape": [
1422
+ 2,
1423
+ 96,
1424
+ 96
1425
+ ],
1426
+ "dtype": "bool",
1427
+ "idx": 2
1428
+ },
1429
+ "position_ids": {
1430
+ "shape": [
1431
+ 2,
1432
+ 96
1433
+ ],
1434
+ "dtype": "i32",
1435
+ "idx": 3
1436
+ }
1437
+ },
1438
+ "outputs": {
1439
+ "logits": {
1440
+ "shape": [
1441
+ 2,
1442
+ 96,
1443
+ 2
1444
+ ],
1445
+ "dtype": "f32",
1446
+ "idx": 0
1447
+ }
1448
+ }
1449
+ },
1450
+ "tensor_slices": {
1451
+ "inputs": {
1452
+ "d0_arg0_1": {
1453
+ "placements": [
1454
+ [
1455
+ 0,
1456
+ 2
1457
+ ],
1458
+ [
1459
+ 0,
1460
+ 96
1461
+ ]
1462
+ ],
1463
+ "origin": "input_ids",
1464
+ "dtype": "i32",
1465
+ "device": "0"
1466
+ },
1467
+ "d0_arg1_1": {
1468
+ "placements": [
1469
+ [
1470
+ 0,
1471
+ 2
1472
+ ],
1473
+ [
1474
+ 0,
1475
+ 96
1476
+ ]
1477
+ ],
1478
+ "origin": "token_type_ids",
1479
+ "dtype": "i32",
1480
+ "device": "0"
1481
+ },
1482
+ "d0_arg2_1": {
1483
+ "placements": [
1484
+ [
1485
+ 0,
1486
+ 2
1487
+ ],
1488
+ [
1489
+ 0,
1490
+ 96
1491
+ ],
1492
+ [
1493
+ 0,
1494
+ 96
1495
+ ]
1496
+ ],
1497
+ "origin": "attention_mask",
1498
+ "dtype": "bool",
1499
+ "device": "0"
1500
+ },
1501
+ "d0_arg3_1": {
1502
+ "placements": [
1503
+ [
1504
+ 0,
1505
+ 2
1506
+ ],
1507
+ [
1508
+ 0,
1509
+ 96
1510
+ ]
1511
+ ],
1512
+ "origin": "position_ids",
1513
+ "dtype": "i32",
1514
+ "device": "0"
1515
+ }
1516
+ },
1517
+ "outputs": {
1518
+ "submod_d0_c0": {
1519
+ "placements": [
1520
+ [
1521
+ 0,
1522
+ 2
1523
+ ],
1524
+ [
1525
+ 0,
1526
+ 96
1527
+ ],
1528
+ [
1529
+ 0,
1530
+ 2
1531
+ ]
1532
+ ],
1533
+ "origin": "logits",
1534
+ "dtype": "f32",
1535
+ "device": "0"
1536
+ }
1537
+ }
1538
+ }
1539
+ },
1540
+ "blobs": {
1541
+ "97bb3cab5f2f7f5f4640c04cbf3b6ee0": null
1542
+ },
1543
+ "param_files": {
1544
+ "0": {
1545
+ "path": "params-mlperf-bert-large-mlperf_submission-24L-W8A8KV8-allow_bfloat16_cast_with_mcp-ba480aa7f239d5bf87fdd9b369ce396c7f516f5fcecf3f40000671d6299f6f5c.safetensors",
1546
+ "format": "safetensors"
1547
+ }
1548
+ },
1549
+ "device_constraints": [],
1550
+ "version": "0.1.0"
1551
+ }
1552
+ ],
1553
+ "pipeline_metadata_list": [
1554
+ {
1555
+ "output_logits_size": null
1556
+ },
1557
+ {
1558
+ "output_logits_size": null
1559
+ },
1560
+ {
1561
+ "output_logits_size": null
1562
+ },
1563
+ {
1564
+ "output_logits_size": null
1565
+ },
1566
+ {
1567
+ "output_logits_size": null
1568
+ },
1569
+ {
1570
+ "output_logits_size": null
1571
+ }
1572
+ ],
1573
+ "max_prompt_len": null
1574
+ },
1575
+ "speculative_model": null,
1576
+ "version": {
1577
+ "major": 2,
1578
+ "minor": 0
1579
+ },
1580
+ "prefill_chunk_size": null
1581
+ }
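artifact.json above pairs each entry of `generator_config.buckets` with a pre-compiled pipeline whose tensors have fixed shapes: (batch 1, 384), (1, 320), (1, 192), (1, 128), (1, 160) and (2, 96). The following is a plain-Python illustration of how a runtime could map a tokenized sequence onto the smallest bucket that fits it; it is only a reading of the data above, not the actual furiosa-llm scheduling logic.

```python
# Illustration only: pick the smallest-capacity compiled bucket for a sequence.
# The list mirrors generator_config.buckets in artifact.json; the real
# scheduler may batch and choose differently.
BUCKETS = [(1, 384), (1, 320), (1, 192), (1, 128), (1, 160), (2, 96)]  # (batch, attention_size)

def pick_bucket(seq_len: int) -> tuple[int, int]:
    """Return the bucket with the smallest attention_size that still fits seq_len."""
    candidates = [b for b in BUCKETS if b[1] >= seq_len]
    if not candidates:
        raise ValueError(f"sequence of length {seq_len} exceeds the 384-token maximum")
    return min(candidates, key=lambda b: b[1])

print(pick_bucket(90))   # (2, 96): two short examples can share one graph
print(pick_bucket(200))  # (1, 320)
print(pick_bucket(384))  # (1, 384)
```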
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+ "_name_or_path": "furiosa-ai/mlperf-bert-large",
+ "architectures": [
+ "BertForQuestionAnswering"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "classifier_dropout": null,
+ "hidden_act": "rngd_gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 1024,
+ "initializer_range": 0.02,
+ "intermediate_size": 4096,
+ "layer_norm_eps": 1e-12,
+ "max_position_embeddings": 512,
+ "model_type": "bert",
+ "num_attention_heads": 16,
+ "num_hidden_layers": 24,
+ "pad_token_id": 0,
+ "position_embedding_type": "absolute",
+ "torch_dtype": "float32",
+ "transformers_version": "4.48.1",
+ "type_vocab_size": 2,
+ "use_cache": true,
+ "vocab_size": 30522
+ }
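As a back-of-the-envelope check, the dimensions in config.json are the standard BERT-large ones, so the dense weights come to roughly 334M parameters; at int8 (one byte per weight) that is broadly consistent with the ~370 MB quantized safetensors file added in this commit, with the remainder going to quantization scales and biases. A small sketch of that arithmetic follows; the per-layer breakdown is an approximation, not a dump of the checkpoint.

```python
# Approximate parameter count implied by config.json (BERT-large + QA head).
hidden, inter, layers = 1024, 4096, 24
vocab, max_pos, types = 30522, 512, 2

embeddings = (vocab + max_pos + types) * hidden   # word + position + token-type tables
per_layer = (
    4 * (hidden * hidden + hidden)                # Q, K, V, attention output projections
    + (hidden * inter + inter)                    # FFN up-projection
    + (inter * hidden + hidden)                   # FFN down-projection
    + 4 * hidden                                  # two LayerNorms (weight + bias each)
)
qa_head = hidden * 2 + 2                          # start/end span logits

total = embeddings + layers * per_layer + qa_head
print(f"~{total / 1e6:.0f}M parameters")          # ~334M
```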
eb1a559cd1f53e2ede74f1307030a1d0.edf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c953d6374b6a56db9f8e0814b91fdd6e45e7db740ec66b907e0d40eab5673586
+ size 1219702748
furiosa_config.json ADDED
@@ -0,0 +1,56 @@
+ {
+ "config_version": "1.0.0",
+ "model_id": "furiosa-ai/mlperf-bert-large",
+ "model_kinds": [
+ "ARTIFACT"
+ ],
+ "model_class": {
+ "module": "furiosa_llm_models.bert.symbolic.mlperf_submission",
+ "name": "BertForQuestionAnswering"
+ },
+ "llm_config": {
+ "optimization_config": {
+ "attention_type": "VANILLA",
+ "optimize_rope": false,
+ "optimize_packed": false,
+ "decompose_layernorm": false,
+ "optimize_furiosa": false,
+ "use_unsplit_packed": true,
+ "compact_causal_mask": false,
+ "use_rngd_gelu": true,
+ "causal_mask_free_decoding": false,
+ "kv_cache_sharing_across_beams": false,
+ "inbound_beamsearch_softmax": false,
+ "calculate_logit_only_for_last_token": false,
+ "optimized_for_speculative_decoding": false
+ },
+ "quantization_config": {
+ "weight": "int8",
+ "activation": "int8",
+ "kv_cache": "int8",
+ "use_mcp": true
+ }
+ },
+ "components_versions": {
+ "furiosa_llm": {
+ "version": "0.1.0-dev",
+ "git_hash": "249c6f1",
+ "build_time": null
+ },
+ "furiosa_ir": {
+ "version": "0.11.0-dev",
+ "git_hash": "c5be5877b",
+ "build_time": "2025-04-23T22:34:58Z"
+ },
+ "furiosa_runtime": {
+ "version": "2025.2.0",
+ "git_hash": "3ba9de71e",
+ "build_time": "2025-04-23T22:39:37Z"
+ },
+ "furiosa_model_compressor": {
+ "version": "2025.2.0 (rev: e4565f6)",
+ "git_hash": null,
+ "build_time": null
+ }
+ }
+ }
params-mlperf-bert-large-mlperf_submission-24L-W8A8KV8-allow_bfloat16_cast_with_mcp-ba480aa7f239d5bf87fdd9b369ce396c7f516f5fcecf3f40000671d6299f6f5c.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8ca766ce22770cdec44e351b9250d1af405e4aa1b6861fe8e93cf7067e17a00c
+ size 369541408
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+ "cls_token": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "mask_token": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "sep_token": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+ "added_tokens_decoder": {
+ "0": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "100": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "101": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "102": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "103": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "do_basic_tokenize": true,
+ "do_lower_case": true,
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "model_max_length": 1000000000000000019884624838656,
+ "never_split": null,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "strip_accents": null,
+ "tokenize_chinese_chars": true,
+ "tokenizer_class": "BertTokenizer",
+ "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff
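The tokenizer files added here (`tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`, `vocab.txt`) are the standard uncased BERT WordPiece tokenizer, so inputs can be prepared with the usual Hugging Face tokenizer API. Below is a minimal sketch, assuming the `transformers` package; padding to 384 matches the largest compiled bucket and the README's maximum context length, while the expanded boolean attention mask and position ids listed in artifact.json are produced later by the runtime.

```python
# Sketch: encode a (question, context) pair into the 384-token layout the
# compiled QA graph expects (input_ids, token_type_ids, attention_mask).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("furiosa-ai/bert-large-uncased-INT8-MLPerf")

question = "What hardware does this artifact target?"
context = "The model is pre-compiled for the FuriosaAI RNGD accelerator."

enc = tokenizer(
    question,
    context,
    padding="max_length",      # pad out to the 384-token bucket
    truncation="only_second",  # truncate the context, never the question
    max_length=384,
    return_tensors="np",
)
print(enc["input_ids"].shape)       # (1, 384)
print(enc["token_type_ids"].shape)  # (1, 384)
print(enc["attention_mask"].shape)  # (1, 384)
```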