4-bit quantization with bitsandbytes not working
#12
by TheBigBlockPC - opened
When trying to quantize this model with bitsandbytes to "nf4", no quantization gets applied: the model throws an out-of-memory error after loading 29% of the checkpoint, and all 48 GB of my VRAM are filled up. Is that an issue with the Qwen MoE, or is it due to the multi-GPU setup?
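
For reference, a minimal sketch of the loading path I'm using, assuming the standard transformers + bitsandbytes route; the model id below is a placeholder for this model, not the exact string from my script:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit quantization config for bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,   # optional, trims memory a bit further
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/<this-model>",              # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",                # shard layers across both GPUs
)
```

Even with this config, memory usage climbs as if the shards were being loaded unquantized.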