Gemma 3n does not seem to work in the sample application for web
Hey guys,
I am encountering a problem. When I try to run Gemma 3n in the sample application for web (https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/llm_inference/js), it does not seem to work. I get this error: `Failed to initialize the task: Array buffer allocation failed`. I tried running gemma2-2b-it-gpu-int8.bin and that worked perfectly. So far, the only change I made to the files was the model filename (see the snippet below).
Machine: Apple M3 Pro
Browser: Chrome
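For reference, the change is just the model path passed to the task-initialization call, roughly like this (a minimal sketch following the sample's README; the WASM CDN URL and the local filename are placeholders for my setup):

```js
import {FilesetResolver, LlmInference} from '@mediapipe/tasks-genai';

// Load the WASM runtime, then point the task at the downloaded model file.
const genai = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm');
const llmInference = await LlmInference.createFromOptions(genai, {
  // The only line I changed: gemma2-2b-it-gpu-int8.bin -> the 3n file.
  baseOptions: {modelAssetPath: '/assets/gemma-3n-E2B-it-int4.task'},
});
```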
This is possibly because the model is too big to be loaded in the browser. Try gemma3-1b-it-int4.task and it will work. The other one you tried is also quantized at 8-bit, so it barely made it :)
litert-community/Gemma3-12B-IT runs just fine in the sample app, so model size is not the issue here.
It seems to be a bug in @mediapipe/tasks-genai's loading of the 3n models (or just of the zip format). The 3n .task files are actually a zip of TFL3-format files plus metadata:
```
(base) dev@Mac-mini gemma-3n-E2B-it-int4.task % ls
METADATA                    TF_LITE_VISION_ADAPTER
TF_LITE_EMBEDDER            TF_LITE_VISION_ENCODER
TF_LITE_PER_LAYER_EMBEDDER  TOKENIZER_MODEL
TF_LITE_PREFILL_DECODE
```
unlike the litert-community/Gemma3-12B-IT task file, which is a single TFL3-format file. If you want to verify which container a given .task file uses, the magic bytes are enough; see the sketch below.
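A quick way to check (a Node sketch; the offsets come from the zip and FlatBuffers file formats, not from anything MediaPipe-specific):

```js
import {openSync, readSync} from 'node:fs';

// Zip archives start with the bytes "PK\x03\x04" at offset 0;
// a bare LiteRT/TFLite flatbuffer carries its "TFL3" file
// identifier at offset 4 (after the 4-byte root-table offset).
const fd = openSync(process.argv[2], 'r');
const head = Buffer.alloc(8);
readSync(fd, head, 0, 8, 0);

if (head.subarray(0, 4).toString('latin1') === 'PK\x03\x04') {
  console.log('zip bundle (like the Gemma 3n .task files)');
} else if (head.subarray(4, 8).toString('latin1') === 'TFL3') {
  console.log('single TFL3 flatbuffer (like Gemma3-12B-IT)');
} else {
  console.log('unknown header:', head.toString('hex'));
}
```

Run it as `node check-task.js gemma-3n-E2B-it-int4.task`.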
Same issue here.
It does not work with MediaPipe; the model does not load.
So we're stuck for now because:
- @mediapipe/tasks-genai's loading of 3n is bugged
- TFLite doesn't support int4 quantization

@google please
The 3n preview is not supported on web just yet. Web LLM inference presently supports:
- all text-only Gemma 3 variants
- MedGemma-27B
- Gemma 2 2B
- the older architectures it initially launched with (Phi 2, Falcon 1B, Stable LM 3B, Gemma 1 2B & 7B)
The full multimodal Gemma 3n is now supported on web with the MediaPipe Web LLM Inference API. Additional demos will follow shortly, but model links and usage instructions are already available here.
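For anyone who wants to try it before the demos land, loading the 3n bundle looks like the text-only path above; the multimodal parts of this sketch (the `maxNumImages` option and the mixed text/image prompt) are my reading of the current docs, so treat them as assumptions and defer to the linked usage instructions:

```js
import {FilesetResolver, LlmInference} from '@mediapipe/tasks-genai';

const genai = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm');
const llmInference = await LlmInference.createFromOptions(genai, {
  baseOptions: {modelAssetPath: '/assets/gemma-3n-E2B-it-int4.task'},
  // Assumption: enables the vision modality per the multimodal docs.
  maxNumImages: 1,
});

// Assumption: mixed text/image prompts are passed as an array.
const image = await createImageBitmap(
    await (await fetch('/assets/photo.jpg')).blob());
const response = await llmInference.generateResponse(
    ['Describe this image: ', image]);
console.log(response);
```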