Active filters: quark
superbigtree/Mistral-Nemo-Instruct-2407-FP8_aq
12B • Updated • 2
aigdat/Llama-3.2-1B-Instruct-awq-uint4-float16
0.4B • Updated • 2
aigdat/Llama-3.2-3B-Instruct-awq-uint4-float16
0.8B • Updated
aigdat/Phi-3.5-mini-instruct-awq-uint4-float16
0.6B • Updated
aigdat/DeepSeek-R1-Distill-Qwen-1.5B_quantized_int4_bfloat16
0.4B • Updated
aigdat/Qwen3-0.6B_quantized_int4_float16
0.2B • Updated • 4
aigdat/Arch-Function-Chat-3B_quantized_int4_float16
0.7B • Updated • 1
aigdat/DeepCoder-14B-Preview_quantized_int4_float16
3B • Updated • 1
aigdat/Qwen2.5-Coder-1.5B-Instruct_quantized_int4_bfloat16
0.4B • Updated
aigdat/Qwen2.5-Coder-7B-Instruct_quantized_int4_bfloat16
aigdat/Qwen2.5-3B-Instruct_quantized_int4_bfloat16
0.7B • Updated • 10
aigdat/Qwen2.5-Coder-32B-Instruct_quantized_int4_bfloat16
5B • Updated • 2
aigdat/Llama-xLAM-2-8b-fc-r_quantized_int4_bfloat16
fxmarty/qwen_1.5-moe-a2.7b-mxfp4
8B • Updated • 4.03k
amd/Llama-3.3-70B-Instruct-MXFP4-Preview
38B • Updated • 4.11k • 2
fxmarty/deepseek_r1_3_layers_mxfp4
8B • Updated • 441 • 1
fxmarty/Llama-4-Scout-17B-16E-Instruct-2-layers-mxfp4
5B • Updated • 2.73k • 1
371B • Updated • 33.3k • 5
mohitsha/Llama-2-7b-hf-w_mx_fp4_per_group_sym
4B • Updated
amd/Llama-3.1-405B-Instruct-MXFP4-Preview
218B • Updated • 482 • 1
amd/DeepSeek-R1-MXFP4-ASQ
363B • Updated • 19 • 1
haoyang-amd/qwen1.5-0.5B-ptpc
0.5B • Updated
amd/DeepSeek-R1-0528-MXFP4
356B • Updated • 10.1k • 1
fxmarty/Llama-3.1-70B-Instruct-2-layers-mxfp6
3B • Updated • 2.83k
fxmarty/qwen1.5_moe_a2.7b_chat_w_fp4_a_fp6_e2m3
8B • Updated • 3.81k
fxmarty/qwen1.5_moe_a2.7b_chat_w_fp6_e2m3_a_fp6_e2m3
11B • Updated • 1
fxmarty/qwen1.5_moe_a2.7b_chat_w_fp6_e3m2_a_fp6_e3m2
11B • Updated • 4.15k
amd/Llama-2-70b-chat-hf-WMXFP4-AMXFP4-KVFP8-Scale-UINT8-MLPerf-GPTQ
37B • Updated • 2
sudhab1988/rakuten-7b-awq-g128-int4-asym-fp16-hf
1B • Updated
matmelis/Llama_3.2_1B_w_uint4_gptq
0.4B • Updated