V9 and V10 need more VRAM: V5 can run with 8 GB VRAM, but V9 and V10 can't

#111
by richuyouyaojie - opened

V9 and V10 need more VRAM. V5 can run with 8 GB VRAM, but V9 and V10 can't.

I also encountered the same problem: 8 GB of VRAM always runs out of memory. Are there any suggestions for improvement?

I use the GGUF version.

Oddly enough, v9 and v10 file sizes are actually smaller! Same parameter count and precision. I wouldn't expect issues with VRAM but who knows sometimes with this stuff (and varying ComfyUI versions).


I've had similar VRAM issues, but I'm not sure why. Would it be easy for you to post just the "recipe" for each version so we can DIY the same basic combination ourselves? I know it's built into the file, but it'd be awesome to have access to it without the massive download. That might also make the VRAM issue easier to diagnose. Thanks!

Try --disable-smart-memory; it seems to be a ComfyUI .68 issue.
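For anyone unsure where that flag goes: here is a minimal sketch of a Python wrapper that launches ComfyUI with it. The ComfyUI/main.py path is an assumption; on the Windows portable build you would instead append the flag to the launch line in your .bat launcher.

```python
# Minimal sketch: start ComfyUI with smart memory management disabled.
# The "ComfyUI/main.py" path is an assumption; adjust it to your install.
import subprocess
import sys

subprocess.run(
    [sys.executable, "ComfyUI/main.py", "--disable-smart-memory"],
    check=True,
)
```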

The recipe is in the safetensors file if you download it: scroll down, copy-paste the workflow, and DeepSeek will turn it back into a JSON if you ask it nicely.

Hi everyone, I also have 8 GB VRAM on an RTX 3070, 32 GB RAM, and a Ryzen 5800X.
First of all, I want to sincerely thank the author Phr00t for his amazing dedication — I hope you're getting some sleep and taking care of your health 😄
Back to the topic: I’m using the checkpoint loader from the multigpu node by polockjj, which includes an option for CPU offloading.
Attached is the workflow showing my current setup. I'm also sending the launcher configuration; the first arguments are the interesting part: I'm not using the --lowvram flag (which could help prevent crashes), and I'm also not using --disable-smart-memory.
I haven’t modified anything, so sorry for the mess.
Some of the environment variables are useful — like lazy loading. There are so many of them, so if you’re curious about any, just ask one of the AIs 🙂 That’s the easiest and fastest way.
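The exact variables aren't named above, so as a hedged illustration of the kind of thing meant by "lazy loading": CUDA_MODULE_LOADING=LAZY (CUDA's lazy kernel loading) and PyTorch's expandable_segments allocator option are two commonly suggested ones. A sketch assuming those two variables, not the poster's actual set:

```python
# Hedged sketch: set two memory-related environment variables before ComfyUI
# starts. These are illustrative examples, not the poster's actual configuration.
import os
import subprocess
import sys

os.environ["CUDA_MODULE_LOADING"] = "LAZY"  # load CUDA kernels only on first use
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"  # reduce allocator fragmentation

subprocess.run([sys.executable, "ComfyUI/main.py"], check=True)
```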
On top of that, I’m using a program designed via Grok and Claude that applies pressure to RAM once a certain threshold is reached, forcing faster offloading to the M2 SSD swap. It works pretty well.
I had to tweak some Windows and registry settings — I admit I did that 3 or 4 months ago, so I don’t remember the details.
But that’s just the cherry on top — it runs perfectly fine even without any extra software. I think it basically just does what simple memory cleaners do 😄
The first run is obviously slow because it’s loading everything, but after that it flies — even when changing prompts or resolution.
Sometimes, after the 3rd or 5th run, it reloads the entire model again, which takes a bit longer — but still, it’s great.
About v10 — it’s one of the best.
As many people have said, 5.3 (even the NSFW version) was my go-to for everything. But after yesterday’s work, I can say that v10 is the new 5.3.
It holds the prompt well, doesn’t generate unnecessary extras, and that infamous blocky texture is gone. And if it does show up, it’s easy to fix with an upscaler or similar tools.
Sorry for the long feedback — I hope it helps someone.
Also, I’ve noticed that when editing images with strong colors — for example, I had an animated logo of a brown bear on an orange background — the processing added a lot of saturation.
I added nodes from the original Comfy workflow (ModelSamplingAuraFlow and CFGNorm).
AuraFlow isn't really relevant here, but I left it in case I want to experiment later.
When the colors got blown out, setting CFGNorm to a value between 0.96 and 0.98 brought the colors back to normal.
Hope this helps someone.
Sorry — I’m translating this with Copilot 😄 I understand English, but if I had to write it myself, it wouldn’t go well.
Attached is the workflow, node preview, and launcher.

Attachments: workflow, Multi_gpu, Screenshot 2025-11-09 094054, Flags_Variables, Flags_Variables_part2

Edit: the attached launcher preview uses SAGE attention (I use it for all other work except Qwen Edit; with that, it throws a black screen no matter which model I try). For Qwen Edit/Image, I simply comment out SAGE and use Flash or xFormers instead. Also, the regular KSampler works faster than Clownshark, but I feel like Clown, even though it takes a bit longer, sometimes handles certain things more precisely.
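A hedged sketch of that per-workload toggle as a small launcher wrapper; --use-sage-attention is ComfyUI's CLI flag for SAGE attention, but its availability depends on your ComfyUI version, and the fallback behavior described in the comments is an assumption:

```python
# Hedged sketch: skip SAGE attention for Qwen Edit (black images, as noted
# above) and enable it for everything else. Flag availability depends on
# your ComfyUI version; without it, ComfyUI falls back to its other
# attention backends (e.g. xformers/PyTorch) if installed.
import subprocess
import sys

USING_QWEN_EDIT = True  # flip this per session

args = [sys.executable, "ComfyUI/main.py"]
if not USING_QWEN_EDIT:
    args.append("--use-sage-attention")  # fine elsewhere, not for Qwen Edit here

subprocess.run(args, check=True)
```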

DeepSeek will turn it back into a JSON if you ask it nicely.

Why?



You can click on a safetensors file here on Hugging Face and it will show you the workflow JSON for it.
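For the earlier request to see the recipe without the massive download: a safetensors file begins with a small JSON header that carries the metadata, so you can fetch just that header over HTTP. A hedged sketch; the URL is a placeholder, and the metadata key names are assumptions (ComfyUI typically embeds the workflow as "workflow" or "prompt"):

```python
# Hedged sketch: read only the safetensors JSON header (which holds the
# embedded metadata) via an HTTP Range request, skipping the multi-GB body.
import json
import struct

import requests

url = "https://huggingface.co/<repo>/resolve/main/<file>.safetensors"  # placeholder

# First 8 bytes: little-endian u64 giving the JSON header length.
head = requests.get(url, headers={"Range": "bytes=0-7"}).content
(header_len,) = struct.unpack("<Q", head)

# Next header_len bytes: the JSON header, including "__metadata__".
raw = requests.get(url, headers={"Range": f"bytes=8-{8 + header_len - 1}"}).content
metadata = json.loads(raw).get("__metadata__", {})

workflow = metadata.get("workflow") or metadata.get("prompt")  # key names are assumptions
print(workflow if workflow else "no embedded workflow found")
```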

I have 8 GB VRAM too. I had the same problem, but I solved it by using the CLIP from an older version of Qwen-Image-Edit-Rapid-AIO, which was normal.

@Phr00t then I guess it's related to the new CLIP converter node in the creation workflow: https://github.com/Shiba-2-shiba/ComfyUI_DiffusionModel_fp8_converter


I am on 8 GB VRAM with the same issue: version 7.1 ran, but 10.4 gives me out of memory. I don't fully understand what you did to fix it. I am using the Qwen-Rapid-AIO.json workflow listed in the files section of this listing. Can anyone explain like I'm 5 how to fix the CLIP part?
