Is it possible to create a version without the MTP layer to save some VRAM?
#1 by adonishong - opened
Thanks for your work. Is it possible to create a version without the MTP layer to save some VRAM, as described in the title?
I think vLLM currently breaks with a quantized MTP layer, so it would break compatibility?