4-bit with bitsandbytes not working

#12
by TheBigBlockPC - opened

When trying to quantize this model with bitsandbytes to "nf4", no quantization seems to get applied: the model throws an out-of-memory error after loading 29% of the checkpoint, with all 48 GB of my VRAM filled up. Is that an issue with the Qwen MoE architecture, or is it due to multi-GPU loading?
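For reference, this is roughly the loading code I'm using (a minimal sketch; the model id is a placeholder for this repo, and the compute-dtype and double-quantization settings are just the ones I picked, not anything recommended by the model card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization config for bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,   # my choice, not from the model card
    bnb_4bit_use_double_quant=True,
)

model_id = "Qwen/<this-model>"  # placeholder for this repository's id
tokenizer = AutoTokenizer.from_pretrained(model_id)

# device_map="auto" is supposed to shard the quantized weights across all visible GPUs,
# but in my case VRAM fills up at ~29% of the checkpoint load
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```

I also tried passing an explicit `max_memory` budget per GPU to `from_pretrained`, but the behavior looked the same, so I'm not sure whether the 4-bit path is being hit at all for the MoE expert weights.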
