Update README.md
README.md CHANGED
@@ -42,7 +42,7 @@ You must be a registered user in 🤗 Hugging Face Hub. Please visit [HuggingFac
 carefully read terms of usage and click accept button. You will need to use an access token for the code below to run. For more information
 on access tokens, refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).
 
-```
+```bash
 pip install optimum-intel[openvino]
 
 optimum-cli export openvino --model meta-llama/Meta-Llama-3.1-8B-Instruct --task text-generation-with-past --weight-format int8 main_model_path
@@ -50,7 +50,7 @@ optimum-cli export openvino --model meta-llama/Meta-Llama-3.1-8B-Instruct --task
 ```
 
 3. Download draft model from HuggingFace Hub
-```
+```python
 import huggingface_hub as hf_hub
 
 draft_model_id = "OpenVINO/Llama-3.1-8B-Instruct-FastDraft-150M"
@@ -59,7 +59,7 @@ draft_model_path = "draft"
 hf_hub.snapshot_download(draft_model_id, local_dir=draft_model_path)
 ```
 4. Run model inference using the speculative decoding and specify the pipeline parameters:
-```
+```python
 import openvino_genai
 
 prompt = "What is OpenVINO?"