FP8/4bit version please

#7
by zhanghx0905 - opened

FP8/4bit version please

An AutoRound GPTQ quant would be better; a rough sketch is below.
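Something like this should do it with Intel's auto-round (untested sketch; the model ID is a placeholder, and the exact API may differ slightly between auto-round versions):

```python
# Sketch: 4-bit GPTQ-format quantization with Intel's auto-round.
# Assumes `pip install auto-round` and enough memory to load the model
# for calibration. "org/model-name" is a placeholder repo ID.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "org/model-name"  # placeholder, swap in the actual repo ID
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# 4-bit, group size 128, symmetric quantization
autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
autoround.quantize()

# Export in GPTQ format so it loads with the usual GPTQ runtimes
autoround.save_quantized("./model-4bit-gptq", format="auto_gptq", inplace=True)
```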

Are there any instructions on how to run this locally with 5x 5090s?

@mtcl I've got 7x 5090s, let me know if you figure it out. Seems like 4-bit quantization should work for me, but I'm still getting OOMs for some reason.

What command are you using to run it? What software are you using?

@mtcl Trying to run it through Transformers. I tried loading in 4-bit with both the Transformers `load_in_4bit` argument and a bitsandbytes config directly; sketch below. How about you?
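For reference, a minimal sketch of the bitsandbytes path (the model ID is a placeholder). In theory `device_map="auto"` should shard the weights across all the visible GPUs; without it everything lands on one card, which would explain OOMs on a multi-GPU box:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "org/model-name"  # placeholder, swap in the actual repo ID

# NF4 4-bit quantization via bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # shard layers across all visible GPUs, not just one
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```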
