Could we get more w2a16, w3a16, and w4a16 AutoRound models?

#1
by twhitworth - opened

Please, please show the world the power of AutoRound!

Intel org

1. We are a very small team, and in addition to quantizing models we are responsible for algorithm and engineering development. If you have the hardware resources, you can perform the quantization yourself; we're happy to assist if you encounter any issues.

2. We plan to release more 2- and 3-bit models once they are fully supported in serving frameworks; we currently lack efficient kernels, especially for Intel devices. For now, several Q2_K_S models are available and can run on both Intel CPUs and CUDA GPUs.
