# THUNDER-AI-GGUF
THUNDER-AI-GGUF is a GGUF release of the THUNDER AI model for local inference.
## Available model file

`THUNDER-AI-R1 V1.2 1.5B.Q4_K_M.gguf`
## Ollama usage

Run the raw model directly from Hugging Face:

```shell
ollama run hf.co/EREN121232/THUNDER-AI-GGUF:Q4_K_M
```
## Included helper files

- `Modelfile.thunder-clean`: builds a cleaned Ollama wrapper model that avoids leaking `<think>...</think>` tags, and sets `num_ctx 8192` for a larger working context.
- `ollama_memory_proxy.py`: optional local proxy for Ollama-compatible clients. Adds lightweight conversation memory by saving useful facts and preferences from user messages and injecting relevant memories into future prompts.
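As a rough sketch of the idea behind the memory proxy (not the actual code in `ollama_memory_proxy.py`; the pattern rules, function names, and file layout here are illustrative assumptions), the mechanism boils down to extracting candidate facts from user messages, storing them, and prepending stored memories to later prompts:

```python
import json
import re
from pathlib import Path

# Assumed default path; the real proxy reads THUNDER_MEMORY_FILE instead.
MEMORY_FILE = Path("thunder_memory.json")

# Toy heuristics: treat "my name is ..." / "i prefer ..." statements as facts worth keeping.
FACT_PATTERNS = [
    re.compile(r"\bmy name is .+", re.IGNORECASE),
    re.compile(r"\bi prefer .+", re.IGNORECASE),
]

def extract_facts(message: str) -> list[str]:
    """Return candidate memory strings found in a user message."""
    facts = []
    for pattern in FACT_PATTERNS:
        match = pattern.search(message)
        if match:
            facts.append(match.group(0).strip())
    return facts

def save_facts(facts: list[str], path: Path = MEMORY_FILE) -> None:
    """Append new facts to a JSON memory file, skipping duplicates."""
    stored = json.loads(path.read_text()) if path.exists() else []
    for fact in facts:
        if fact not in stored:
            stored.append(fact)
    path.write_text(json.dumps(stored))

def inject_memories(prompt: str, path: Path = MEMORY_FILE, limit: int = 5) -> str:
    """Prepend up to `limit` stored memories to a new prompt."""
    if not path.exists():
        return prompt
    stored = json.loads(path.read_text())[:limit]
    if not stored:
        return prompt
    header = "Known facts about the user:\n" + "\n".join(f"- {f}" for f in stored)
    return f"{header}\n\n{prompt}"
```

The real proxy sits between the client and Ollama, so these steps happen transparently on every request rather than in the client code.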
## Build the cleaned Ollama model

```shell
ollama create thunder-ai-clean -f Modelfile.thunder-clean
```
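For orientation, a wrapper Modelfile of this kind usually looks something like the sketch below. This is not the contents of the repo's `Modelfile.thunder-clean`; the base path and the stop-parameter approach to suppressing `<think>` output are assumptions.

```
# Sketch only: the actual Modelfile.thunder-clean in this repo may differ.
FROM ./THUNDER-AI-R1 V1.2 1.5B.Q4_K_M.gguf

# Larger working context, as described above.
PARAMETER num_ctx 8192

# A stop sequence is one common way to keep <think>...</think> reasoning
# from leaking into the final output (the exact mechanism used here is assumed).
PARAMETER stop "<think>"
```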
## Optional memory proxy usage
The memory proxy is meant for local setups where an app talks to Ollama through an HTTP endpoint.
Set these environment variables if you want to customize it:

- `THUNDER_REAL_OLLAMA_BASE_URL`
- `THUNDER_PROXY_HOST`
- `THUNDER_PROXY_PORT`
- `THUNDER_MEMORY_FILE`
- `THUNDER_MEMORY_MAX`
- `THUNDER_MEMORY_INJECT_MAX`
Then run:

```shell
python ollama_memory_proxy.py
```
By default it listens on 127.0.0.1:11435 and forwards requests to Ollama on 127.0.0.1:11434.
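A minimal sketch of the per-request flow such a proxy follows, assuming the environment variable names listed above and illustrative helper names (this is not the actual proxy code):

```python
import json
import os
from urllib.request import Request, urlopen

# Defaults mirror the behaviour described above; env var names come from this README.
REAL_BASE = os.environ.get("THUNDER_REAL_OLLAMA_BASE_URL", "http://127.0.0.1:11434")
PROXY_HOST = os.environ.get("THUNDER_PROXY_HOST", "127.0.0.1")
PROXY_PORT = int(os.environ.get("THUNDER_PROXY_PORT", "11435"))

def rewrite_payload(payload: dict, memories: list[str]) -> dict:
    """Inject stored memories into an Ollama /api/generate style payload."""
    if memories and "prompt" in payload:
        header = "Relevant memories:\n" + "\n".join(f"- {m}" for m in memories)
        payload = {**payload, "prompt": f"{header}\n\n{payload['prompt']}"}
    return payload

def forward(path: str, payload: dict) -> bytes:
    """Send the (possibly rewritten) payload on to the real Ollama server."""
    req = Request(
        REAL_BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:  # network call; requires a running Ollama
        return resp.read()
```

Because the proxy speaks the same HTTP API as Ollama, a client only needs its base URL pointed at port 11435 instead of 11434.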
## Notes
- This repo is for local GGUF usage.
- Machine-specific launcher scripts were intentionally not included in the repo because they depend on local Windows paths and drive layout.
- The model was fine-tuned and exported with Unsloth.