Models: 365

Active filters: 4bit

BoltMonkey/boltmonkey_shortreasoning-8b-Q5_K_M-GGUF

Text Generation • 8B • Updated Apr 18 • 8

TheCluster/Comet_12B_V.4-mlx-4bit

Image-Text-to-Text • Updated Apr 23 • 2

TechyCode/tinyllama-sciq-lora

Text Generation • Updated Apr 23

TheCluster/Amoral-Fallen-Omega-Gemma3-12B-mlx-4bit

Image-Text-to-Text • Updated Apr 23 • 80 • 2

Sumo10/Phi-4-mini-instruct-AWQ-4bit

4B • Updated Apr 25 • 26 • 1

Sumo10/Llama-3.2-3B-Instruct-AWQ-4bit

3B • Updated Apr 25 • 2

cyberandy/SEOcrate-4B_grpo_new_01

Text Generation • 4B • Updated May 8 • 9 • 6

Chun121/qwen3-4B-rpg-roleplay

Text Generation • 4B • Updated Jul 7 • 748 • 14

taetae030/fin-term-model

5B • Updated May 4 • 1 • 1

SujitShelar/llama3-medchat-8b-lora

Question Answering • Updated May 5

boods/mistral-location-extractor-4bit

Text Generation • 7B • Updated Sep 29 • 10

mradermacher/SEOcrate-4B_grpo_new_01-GGUF

Reinforcement Learning • 4B • Updated Jul 11 • 138 • 1

mradermacher/SEOcrate-4B_grpo_new_01-i1-GGUF

Reinforcement Learning • 4B • Updated Jul 11 • 876

vannishh/llama3-2.1B-4bit-finetuned

Programmer-RD-AI/ResearchQwen-2.5-3B-LoRA

Question Answering • 3B • Updated May 26 • 4

CodCodingCode/DeepSeek-V2-medical

Text Generation • Updated May 18

tripolskypetr/Plutus-Meta-Llama-3.1-8B-Instruct-bnb-4bit

Text Generation • 8B • Updated May 21 • 48

abdou-u/MNLP_M2_quantized_model

Text Generation • 0.4B • Updated May 19 • 2

HagalazAI/CyberDolphin-2.9.3-mistral-nemo-12b

Text Generation • 12B • Updated May 22 • 69 • 1

HagalazAI/CyberDolphin-2.9.3-mistral-nemo-12b-GGUF

Text Generation • 12B • Updated May 23 • 154 • 2

Jimmi42/sarvam-m-4bit-mlx

Text Generation • 4B • Updated May 26 • 4 • 1

geninhu/RakutenAI-7B-instruct-GPTQ

Updated May 30 • 2

umangshikarvar/sentiment-qlora-gptneo

Text Classification • Updated Jun 2 • 15

Fulstac/deepseek-r1-Distill-Qwen-32B-sqlgen-4bit-v1

Text Generation • 33B • Updated Jun 6

Fulstac/deepseek-r1-Distill-Qwen-32B-lora-4bit-v3

Text Generation • 33B • Updated Jun 6

acauanrr/qlora-ti-2025-adapter

Updated Jun 7 • 11

abdou-u/MNLP_M3_quantized_dpo_mcqa_model

Multiple Choice • 0.4B • Updated Jun 8 • 3

kevin510/friday-4bit

Text Generation • 2B • Updated Sep 23 • 6

ayureasehealthcare/ayurezeastraai

Text Generation • Updated Jun 13

Renugadevi82/cisco-nx-ai-4bit

1B • Updated Jun 16 • 11