Models: 374

Active filters: 4bit

legraphista/Llama-3.2-3B-Instruct-IMat-GGUF

Text Generation • 3B • Updated Sep 25, 2024 • 737 • 1

Narrator5000/llavanext-finetuned-stackoverflow-vqa

Updated Sep 29, 2024 • 5 • 1

NeoChen1024/internlm2_5-20b-chat-exl2-4.25bpw-h8

Text Generation • Updated Sep 30, 2024

ussipan/SipanGPT-0.1-Llama-3.2-1B-GGUF

Text Generation • 1B • Updated Oct 19, 2024 • 62 • 1

ussipan/SipanGPT-0.2-Llama-3.2-1B-GGUF

Text Generation • 1B • Updated Oct 19, 2024 • 121

mcavus/glm-4v-9b-gptq-4bit-dynamo

3B • Updated Oct 10, 2024 • 2 • 1

ussipan/SipanGPT-0.3-Llama-3.2-1B-GGUF

Text Generation • 1B • Updated Dec 23, 2024 • 70 • 1

harishnair04/Gemma-medtr-2b-sft

Text Generation • 2B • Updated Nov 7, 2024 • 2

harishnair04/Gemma-medtr-2b-sft-v2

Text Generation • 3B • Updated Nov 15, 2024 • 4

mradermacher/Gemma-medtr-2b-sft-v2-GGUF

3B • Updated Nov 16, 2024 • 116

NaomiBTW/L3-8B-Lunaris-v1-GPTQ

Text Generation • Updated Nov 11, 2024

ModelCloud/Qwen2.5-Coder-32B-Instruct-gptqmodel-4bit-vortex-v1

Text Generation • 7B • Updated Nov 14, 2024 • 56 • 16

Rakushaking/llm-jp-3-13b-it

Updated Aug 24 • 4

ModelCloud/QwQ-32B-Preview-gptqmodel-4bit-vortex-v1

Text Generation • 7B • Updated Dec 18, 2024 • 41 • 51

nisten/qwen2.5-coder-7b-abliterated-128k-AWQ

Text Generation • 2B • Updated Jan 7 • 4

ModelCloud/QwQ-32B-Preview-gptqmodel-4bit-vortex-v2

Text Generation • 7B • Updated Dec 18, 2024 • 28 • 16

ModelCloud/QwQ-32B-Preview-gptqmodel-4bit-vortex-v3

Text Generation • 7B • Updated Dec 20, 2024 • 15 • 14

mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit

Text Generation • 1B • Updated Dec 20, 2024 • 5

ModelCloud/Falcon3-10B-Instruct-gptqmodel-4bit-vortex-v1

Text Generation • 2B • Updated Dec 21, 2024 • 13 • 3

adriabama06/SmallThinker-3B-Preview-AWQ

Text Generation • Updated Jan 3 • 5 • 1

exxocism/Linkbricks-Horizon-AI-Llama-3.3-Korean-70B-sft-dpo-GGUF

Text Generation • Updated Jan 7

ehristoforu/Phi4-MoE-2x14B-Instruct

Text Generation • 14B • Updated Jan 9 • 6

ModelCloud/Qwen2.5-0.5B-Instruct-gptqmodel-w4a16

Text Generation • 0.5B • Updated Oct 19 • 20 • 1

ModelCloud/DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v1

Text Generation • 2B • Updated Jan 24 • 14 • 5

ModelCloud/DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v2

Text Generation • 2B • Updated Jan 24 • 115 • 7

vital-ai/watt-tool-70B-awq

11B • Updated Jan 24 • 3 • 4

curiousmind147/microsoft-phi-4-AWQ-4bit-GEMM

Text Generation • 3B • Updated Feb 4 • 234 • 1

ConfidentialMind/Mistral-Small-24B-Instruct-2501_GPTQ_G128_W4A16_MSE

Text Classification • 4B • Updated Feb 18 • 29 • 1

ConfidentialMind/Virtuoso-Medium-v2_GPTQ_G128_W4A16

Text Generation • 6B • Updated Feb 16 • 3

ConfidentialMind/Virtuoso-Medium-v2_GPTQ_G32_W4A16

Text Generation • 7B • Updated Feb 16 • 15