---
library_name: mlx
language:
- en
- fr
- de
- es
- pt
- it
- ja
- ko
- ru
- zh
- ar
- fa
- id
- ms
- ne
- pl
- ro
- sr
- sv
- tr
- uk
- vi
- hi
- bn
license: apache-2.0
tags:
- all use cases
- creative
- creative writing
- all genres
- tool calls
- tool use
- llama 3.1
- llama-3
- llama3
- llama-3.1
- problem solving
- deep thinking
- reasoning
- deep reasoning
- story
- writing
- fiction
- roleplaying
- bfloat16
- role play
- sillytavern
- backyard
- context 128k
- mergekit
- merge
- moe
- mixture of experts
- mlx
pipeline_tag: text-generation
base_model: DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B
---
# Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx
This model, Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx, was converted to MLX format from [DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B](https://huggingface.co/DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B) using mlx-lm version **0.26.0**.
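A conversion like this can be reproduced with mlx-lm's `convert` API. A minimal sketch, assuming the uploader used default settings apart from 8-bit quantization (the `q8` suffix in the repo name suggests `q_bits=8`; the exact options actually used are not documented here):

```python
# Hypothetical reproduction of the conversion with mlx-lm.
# quantize/q_bits are assumptions inferred from the "q8" repo suffix.
from mlx_lm import convert

convert(
    "DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B",
    mlx_path="Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx",
    quantize=True,
    q_bits=8,
)
```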
## Use with mlx

```bash
pip install mlx-lm
```
```python
from mlx_lm import load, generate

# Load the 8-bit quantized model and its tokenizer.
model, tokenizer = load("Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx")

prompt = "hello"

# Wrap the raw prompt in the model's chat template when one is defined.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
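For incremental output instead of a single blocking call, mlx-lm also provides `stream_generate`. A minimal sketch, assuming a recent mlx-lm release where the generator yields response chunks with a `.text` field (worth checking against your installed version):

```python
from mlx_lm import load, stream_generate

model, tokenizer = load("Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q8-mlx")

messages = [{"role": "user", "content": "hello"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Print tokens as they are generated rather than waiting for the full response.
for chunk in stream_generate(model, tokenizer, prompt=prompt):
    print(chunk.text, end="", flush=True)
print()
```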