Models: 375

Active filters: torchao

gurro/llama-3.1-8B-torchao-int4wo-128

Text Generation • Updated Dec 2, 2024 • 6

gurro/llama-3.1-8B-torchao-int4wo-256

Text Generation • Updated Dec 2, 2024 • 9

jerryzh168/llama3-8b-autoquant

Text Generation • Updated Feb 19, 2025 • 29

medmekk/Llama-3.1-8B-Instruct-torchao-int8_weight_only

Updated Jan 8, 2025 • 4

medmekk/Llama-3.1-8B-Instruct-torchao-int8wo

Updated Jan 8, 2025 • 4

medmekk/Llama-3.1-8B-Instruct-torchao-int8da8w

Updated Jan 8, 2025 • 2

medmekk/Llama-3.2-3B-Instruct-torchao-int8wo

Updated Jan 8, 2025 • 3

medmekk/Llama-3.2-1B-torchao-int8wo

Updated Jan 8, 2025 • 4

medmekk/Llama-3.2-1B-torchao-int8da8w

Updated Jan 8, 2025 • 5

medmekk/Llama-3.2-3B-Instruct-torchao-int8da8w

Updated Jan 8, 2025 • 3

medmekk/Llama-3.1-70B-Instruct-torchao-int8da8w

Updated Jan 8, 2025 • 1

jerryzh168/Meta-Llama-3-8B-torchao-int8_weight_only

Updated Jan 13, 2025 • 3

jerryzh168/Meta-Llama-3-8B-torchao-int4_weight_only-gs_128

Updated Jan 13, 2025 • 2

jerryzh168/Meta-Llama-3-8B-torchao-int4_weight_only-gs_64

Updated Jan 13, 2025 • 5

HF-Quantization/Llama-3.2-1B-TORCHAO-W8

Updated Jan 21, 2025 • 5

HF-Quantization/Llama-3.2-1B-TORCHAO-W8A8

Updated Jan 21, 2025 • 2

HF-Quantization/Llama-3.2-1B-TORCHAO-W4

Updated Jan 21, 2025 • 4

HF-Quantization/Llama-3.3-70B-Instruct-TORCHAO-W4

Updated Jan 22, 2025 • 2

jpablomch/Meta-Llama-3-8B-Instruct-torchao

Text Generation • Updated Feb 19, 2025 • 9

jerryzh168/llama3-8b-int4wo-128

Text Generation • Updated Feb 21, 2025 • 12

jerryzh168/llama3-8b-int8wo

Text Generation • Updated Feb 27, 2025 • 6

alpindale/Meta-Llama-3-8B-torchao-int8_weight_only

Updated Mar 2, 2025 • 29

drisspg/f8a8-opt-125m

Text Generation • Updated Mar 4, 2025 • 7

drisspg/f8a8-opt-125m_2

Text Generation • Updated Mar 5, 2025 • 10

drisspg/float8_dynamic_act_float8_weight-opt-125m

Text Generation • Updated Mar 19, 2025 • 6

marksaroufim/Meta-Llama-3-8B-torchao-int8_weight_only

Updated Mar 20, 2025 • 9

jerryzh168/llama3-int8wo

Text Generation • Updated Mar 20, 2025 • 7

jerryzh168/llama3-int4wo

Text Generation • Updated Mar 21, 2025 • 6

jerryzh168/gemma3-8da4w

Any-to-Any • Updated Mar 25, 2025 • 7

jerryzh168/gemma3-4b-it-float8dq

Any-to-Any • Updated Mar 26, 2025 • 7