		How to run inference with a model loaded on both CPU and GPU with device_map="balanced"
The error comes from a device mismatch. Some parts of your pipeline are on the GPU (cuda:0), while others (like the text_encoder) are still on the CPU. When tensors from different devices interact (e.g., a matrix multiplication between a CPU and a GPU tensor), PyTorch throws this error.
You used device_map="balanced". That tells accelerate/diffusers to automatically spread submodules across CPU and GPU to fit memory.
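For context, this is roughly how such a split comes about (a minimal sketch; the model id is a placeholder):

```python
import torch
from diffusers import DiffusionPipeline

# "balanced" lets accelerate decide per-component placement, so some
# submodules may land on the CPU when GPU memory is tight.
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",  # placeholder model id
    torch_dtype=torch.float16,
    device_map="balanced",
)
```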
In your case:
```python
>>> pipeline.hf_device_map
{'text_encoder': 'cpu', 'vae': 0}
```
→ The text encoder is on the CPU, while the VAE is on the GPU.
During inference, the text encoder outputs CPU tensors, but the rest of the pipeline expects GPU tensors → mismatch.
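The same class of error can be reproduced with two plain tensors (a minimal sketch; requires a CUDA device):

```python
import torch

cpu_tensor = torch.randn(2, 4)                  # lives on the CPU
gpu_tensor = torch.randn(4, 3, device="cuda")   # lives on cuda:0

# Mixing devices in one op raises:
# RuntimeError: Expected all tensors to be on the same device,
# but found at least two devices, cuda:0 and cpu!
cpu_tensor @ gpu_tensor
```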
I would suggest using device_map="auto": it will smartly offload part of the model to the CPU, and it is the easiest fix. There are other fixes, but they are more complicated (one is sketched below); let me know if you want to know more.
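One of those other fixes, as a sketch: skip device_map entirely and use diffusers' model CPU offload, which moves each component to the GPU only while it runs, so tensors never mix devices (the model id is a placeholder; requires accelerate):

```python
import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",  # placeholder model id
    torch_dtype=torch.float16,
)

# Each component is moved to the GPU right before it is used and
# offloaded back to the CPU afterwards, keeping peak VRAM low.
pipeline.enable_model_cpu_offload()

image = pipeline("an astronaut riding a horse").images[0]
```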
What was the error?
What is the hardware you are using?
@Parveshiiii, I am trying to run it on Google Colab with a T4 GPU, and it looks like device_map="auto" is not supported in the diffusers library. I am getting the error message:

```
NotImplementedError: auto not supported. Supported strategies are: balanced, cuda
```
In diffusers/src/diffusers/pipelines/pipeline_utils.py:

```python
SUPPORTED_DEVICE_MAP = ["balanced"] + [get_device()]
```
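So on a CUDA machine the accepted values resolve to "balanced" and "cuda". A call using one of the supported strategies would look like this (a minimal sketch; the model id is a placeholder):

```python
import torch
from diffusers import DiffusionPipeline

# "balanced" splits components across devices; "cuda" places the
# whole pipeline on the GPU (assuming it fits in the T4's memory).
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",  # placeholder model id
    torch_dtype=torch.float16,
    device_map="cuda",
)
```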

