Could we get more w2a16, w3a16, and w4a16 AutoRound models?

#1
by twhitworth - opened

Please, please show the world the power of AutoRound!

Intel org

1. We are a very small team, and in addition to quantizing models we are responsible for algorithm and engineering development. If you have the hardware resources, you can perform the quantization yourself; we're happy to assist if you encounter any issues.

2. We plan to release more 2- and 3-bit models once they are fully supported in serving frameworks; we currently lack efficient kernels, especially for Intel devices. For now, several Q2_K_S models are available and can run on both Intel CPUs and CUDA GPUs.
