VLLM error
#2 opened by mlinmg
Using the latest version of vLLM (0.6.1.post2) throws `Unsupported base layer: QKVParallelLinear(in_features=8192, output_features=10240, bias=False, tp_size=1, gather_output=False)` during init.
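For reference, a minimal sketch of the call that triggers the error; the model id is a placeholder, since the actual AQLM checkpoint isn't named in this thread:

```python
# Minimal repro sketch on vLLM 0.6.1.post2.
# "<aqlm-model-id>" is a hypothetical placeholder for the AQLM-quantized
# checkpoint being loaded (not named in this thread).
from vllm import LLM

# Engine construction loads and maps the model weights; with an AQLM
# checkpoint this is the step that raises
# "Unsupported base layer: QKVParallelLinear(...)".
llm = LLM(model="<aqlm-model-id>")
```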
vLLM doesn't support this quantization method (AQLM); see the list of supported methods at https://docs.vllm.ai/en/latest/features/quantization/
Yes, it was deprecated some time ago.
If you really want to use it, you can downgrade vLLM.
I used the transformers framework for inference instead, since vLLM is new to me.
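For anyone taking the same route, a rough sketch of AQLM inference through transformers; the model id is a placeholder, and it assumes the `aqlm` package is installed (e.g. `pip install aqlm[gpu]`):

```python
# Sketch of AQLM inference via transformers; "<aqlm-model-id>" is a
# hypothetical placeholder for the checkpoint used in this thread.
# transformers picks up the AQLM quantization config from the checkpoint
# itself, so no extra quantization arguments are needed here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<aqlm-model-id>"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # non-quantized layers in fp16
    device_map="auto",          # place weights on available GPU(s)
)

inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```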