qaihm-bot committed
Commit a4b5d80 (verified) · 1 Parent(s): 852ef75

See https://github.com/quic/ai-hub-models/releases/v0.38.0 for the changelog.

Files changed (1)
  1. README.md +3 -2

README.md CHANGED
@@ -31,8 +31,8 @@ More details on model performance across various devices, can be found
 - **Model Type:** Model_use_case.text_generation
 - **Model Stats:**
   - Input sequence length for Prompt Processor: 128
-  - Context length: 4096
-  - Precision: w4a16 + w8a16 (few layers)
+  - Maximum context length: 4096
+  - Precision: w4 + w8 (few layers) with fp16 activations and w4a16 + w8a16 (few layers) are supported
   - Num of key-value heads: 8
   - Model-1 (Prompt Processor): Llama-PromptProcessor-Quantized
   - Prompt processor input: 128 tokens + position embeddings + attention mask + KV cache inputs
@@ -52,6 +52,7 @@ More details on model performance across various devices, can be found
 | Llama-v3.2-3B-Instruct | w4a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | GENIE | 18.4176 | 0.12593600000000002 - 4.029952000000001 | -- | -- |
 | Llama-v3.2-3B-Instruct | w4a16 | SA8255P ADP | Qualcomm® SA8255P | GENIE | 14.02377 | 0.187414 - 5.997256999999999 | -- | -- |
 | Llama-v3.2-3B-Instruct | w4 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | GENIE | 13.83 | 0.088195 - 2.82225 | -- | -- |
+| Llama-v3.2-3B-Instruct | w4 | SA8295P ADP | Qualcomm® SA8295P | GENIE | 3.523 | 0.37311700000000003 - 2.9849360000000003 | -- | -- |
 
 ## Deploying Llama 3.2 3B on-device
 
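For readers interpreting the benchmark rows touched by this diff: the two numeric columns appear to be the response rate (tokens per second) and a time-to-first-token range in seconds, following the column headers of the upstream README table (the header row is outside this hunk, so treat that mapping as an assumption). Below is a minimal Python sketch, under that assumption, that turns the rows into rough end-to-end latency estimates; the 256-token response length is a hypothetical choice for illustration.

```python
# Minimal sketch: rough response-latency estimates from the benchmark rows above.
# Assumption (not stated in this diff): column 6 is response rate in tokens/second,
# column 7 is the time-to-first-token (TTFT) range in seconds.

rows = [
    # (device, tokens_per_second, (ttft_low_s, ttft_high_s))
    ("Snapdragon X Elite CRD", 18.4176, (0.125936, 4.029952)),
    ("SA8255P ADP", 14.02377, (0.187414, 5.997257)),
    ("Snapdragon 8 Elite QRD", 13.83, (0.088195, 2.82225)),
    ("SA8295P ADP", 3.523, (0.373117, 2.984936)),
]


def estimated_response_time(tokens_per_second: float, ttft: float, num_tokens: int) -> float:
    """Wait for the first token, then stream the remaining tokens at the reported rate."""
    return ttft + (num_tokens - 1) / tokens_per_second


if __name__ == "__main__":
    num_tokens = 256  # hypothetical response length
    for device, tps, (ttft_lo, ttft_hi) in rows:
        lo = estimated_response_time(tps, ttft_lo, num_tokens)
        hi = estimated_response_time(tps, ttft_hi, num_tokens)
        print(f"{device}: ~{lo:.1f}-{hi:.1f} s for {num_tokens} generated tokens")
```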