Tinker-Stack/Nemotron-3-Nano-30B-A3B-IQ4_XS-GGUF
Text Generation · GGUF

Tags: nemotron, nvidia, mixture-of-experts, mamba, tool-calling, reasoning, llama-cpp, ollama, 11gb-vram, rtx-2080-ti, turing, imatrix, conversational
License: nvidia-open-model-license
1 contributor · History: 6 commits

Latest commit by Tinker-Stack (189d56e, verified, about 2 months ago): "Update: Flash Attention benchmarks (+35% sustained), 26.7 tok/s production config"
| File | Size | Last commit message | Age |
|---|---|---|---|
| .gitattributes | 1.6 kB | Upload nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf with huggingface_hub | about 2 months ago |
| Modelfile | 195 Bytes | Upload Modelfile with huggingface_hub | about 2 months ago |
| README.md | 8.39 kB | Update: Flash Attention benchmarks (+35% sustained), 26.7 tok/s production config | about 2 months ago |
| nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf | 18.1 GB | Upload nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf with huggingface_hub | about 2 months ago |
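Since the repo ships both the quantized GGUF and a Modelfile (and is tagged `ollama` / `llama-cpp`), a minimal sketch of fetching the weights and registering them locally might look like the following. The local model name `nemotron-3-nano` and the download directory are assumptions, not part of the repo; the bundled Modelfile is assumed to reference the GGUF by its filename.

```shell
# Download the quantized GGUF (~18.1 GB) and the Modelfile from the repo
# (repo and filenames are taken from the file listing above)
huggingface-cli download Tinker-Stack/Nemotron-3-Nano-30B-A3B-IQ4_XS-GGUF \
  nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf Modelfile --local-dir .

# Register the model with Ollama using the bundled Modelfile
# ("nemotron-3-nano" is an arbitrary local name chosen for this sketch)
ollama create nemotron-3-nano -f Modelfile

# Start an interactive session
ollama run nemotron-3-nano
```

Alternatively, the GGUF can be served directly with llama.cpp's `llama-server -m nvidia_Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf`; the repo's `11gb-vram` / `rtx-2080-ti` tags suggest it was sized for partial GPU offload on ~11 GB cards.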