I am struggling to deploy inference endpoints. I tried to start a few:
- TheBloke/Wizard-Vicuna-30B-Uncensored-GPTQ
- s3nh/lmsys-vicuna-7b-v1.5-16k-GGML
- PygmalionAI/pygmalion-6b
- TheBloke/WizardLM-7B-uncensored-GPTQ
and all of them fail spectacularly, and I do not really know where to start with debugging or how to search for the errors.
I always used the recommended/automatic instance suggestion, and if none was suggested I used an accelerated instance, since all of those models are either Text Completion or Conversational.
Common errors I encountered were just:

```
Error: ShardCannotStart
Application startup failed. Exiting.
```

And also a very general error which I cannot reproduce right now (I already deleted the endpoints).
Maybe I am too blind to see the elephant in the room, or it is camouflaging itself very well.
Many thanks in advance, and have a great day.
Greetings, qbin