---
base_model: google-bert/bert-large-uncased
license: apache-2.0
pipeline_tag: question-answering
library_name: furiosa-llm
tags:
- furiosa-ai
---

# Model Overview

- **Model Architecture:** BERT
- **Input:** Text
- **Output:** Text
- **Model Optimizations:** INT8 quantization (weights, activations, and KV cache; see Quantization below)
- **Maximum Context Length:** 384 tokens
- **Intended Use Cases:** Intended for commercial and non-commercial use. Like [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased), this model is intended for question answering.
- **Release Date:** 04/12/2025
- **Version:** v2025.2
- **License(s):** [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md)
- **Supported Inference Engine(s):** Furiosa LLM
- **Supported Hardware Compatibility:** FuriosaAI RNGD
- **Preferred Operating System(s):** Linux
- **Quantization:**
  - Tool: Furiosa Model Compressor v0.6.2, included in Furiosa SDK 2025.2
  - Weight: int8, Activation: int8, KV cache: int8
  - Calibration: [SQuAD v1.1 dataset](https://rajpurkar.github.io/SQuAD-explorer/) ([instructions](https://zenodo.org/records/4792496)), [100 samples](https://github.com/mlcommons/inference/blob/master/calibration/SQuAD-v1.1/bert_calibration_features.txt)

## Description

This model is a pre-compiled, INT8-quantized version of [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased), a transformer-based language model, packaged for question answering on FuriosaAI RNGD.

## Usage

### MLPerf Benchmark using RNGD

Follow the example command below after [installing furiosa-mlperf and its prerequisites](https://developer.furiosa.ai/latest/en/getting_started/furiosa_mlperf.html).

```sh
furiosa-mlperf bert-offline furiosa-ai/bert-large-uncased-INT8-MLPerf ./mlperf-result
```
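
The benchmark harness handles tokenization internally. For reference, the minimal sketch below shows what a SQuAD-style input looks like at the model's 384-token maximum context length. It assumes this artifact shares the vocabulary of the upstream `google-bert/bert-large-uncased` tokenizer; the question and context strings are illustrative, not part of the benchmark.

```python
from transformers import AutoTokenizer

# Assumption: the pre-compiled artifact uses the upstream BERT vocabulary.
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-large-uncased")

question = "Where is the Eiffel Tower located?"
context = "The Eiffel Tower is a wrought-iron lattice tower in Paris, France."

# SQuAD-style layout: [CLS] question [SEP] context [SEP], padded/capped
# at the model's 384-token maximum context length.
inputs = tokenizer(
    question,
    context,
    max_length=384,
    truncation="only_second",  # truncate the context, never the question
    padding="max_length",
    return_tensors="np",
)
print(inputs["input_ids"].shape)  # (1, 384)
```

Note that this sketch simply truncates long contexts; full SQuAD preprocessing, such as the MLCommons feature generation linked in the Quantization section, instead splits an over-long context into overlapping sliding windows.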