---
library_name: mlx
language:
- en
- fr
- de
- es
- pt
- it
- ja
- ko
- ru
- zh
- ar
- fa
- id
- ms
- ne
- pl
- ro
- sr
- sv
- tr
- uk
- vi
- hi
- bn
license: apache-2.0
tags:
- all use cases
- creative
- creative writing
- all genres
- tool calls
- tool use
- llama 3.1
- llama-3
- llama3
- llama-3.1
- problem solving
- deep thinking
- reasoning
- deep reasoning
- story
- writing
- fiction
- roleplaying
- bfloat16
- role play
- sillytavern
- backyard
- context 128k
- mergekit
- merge
- moe
- mixture of experts
- mlx
pipeline_tag: text-generation
base_model: DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B
---
# Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx

This model [Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx](https://huggingface.co/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx) was converted to MLX format from [DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B](https://huggingface.co/DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B) using mlx-lm version **0.26.0**.
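The card does not record the exact conversion command, only the mlx-lm version. For reference, a minimal sketch of how an 8-bit conversion like this is typically produced with the `mlx_lm.convert` CLI; the output path and flag values here are assumptions, not the command actually used:

```bash
# Hypothetical reconstruction of the conversion step (flags are assumed,
# not taken from this repo's history).
pip install mlx-lm
mlx_lm.convert \
    --hf-path DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B \
    --mlx-path Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx \
    -q --q-bits 8
```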
## Use with mlx

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

# Load the quantized model and its tokenizer from the Hub repo or a local path.
model, tokenizer = load("Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx")

prompt = "hello"

# If the tokenizer ships a chat template, wrap the prompt in a chat message
# so the model sees the conversation format it was trained on.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
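The model can also be run without writing any Python, via the `mlx_lm.generate` command-line entry point. A minimal sketch; the prompt and token budget below are illustrative, not values from this card:

```bash
# Generate from the command line; --max-tokens is an illustrative value.
mlx_lm.generate \
    --model Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx \
    --prompt "hello" \
    --max-tokens 256
```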