
Qwen3.5-122B-A10B-abliterated-GGUF

GGUF quantized versions of wangzhang/Qwen3.5-122B-A10B-abliterated, a decensored variant of Qwen/Qwen3.5-122B-A10B created using Abliterix.

Available Quantizations

| File | Quantization | Size | Quality | Use Case |
|------|--------------|------|---------|----------|
| Q4_K_M | Q4_K_M | 70 GB | High | 1x 80GB GPU, or CPU with 96GB+ RAM |
| Q8_0 | Q8_0 | 121 GB | Very High | 2x 80GB GPUs, or CPU with 160GB+ RAM |

About the Source Model

  • 95% refusal reduction: Reduced from 100/100 refusals to just 5/100 (5%)
  • MoE architecture: Qwen3.5-122B-A10B activates only ~10B parameters per token — 14B-class speed with 122B-class knowledge
  • Minimal capability loss: KL divergence of just 0.0878 from the original model
  • 50-trial Optuna TPE optimization: Automated Bayesian hyperparameter search with multi-objective Pareto optimization
  • Orthogonalized abliteration: Surgical removal of refusal directions without degrading general intelligence

See wangzhang/Qwen3.5-122B-A10B-abliterated for full details on the abliteration process and steering parameters.
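The "orthogonalized abliteration" mentioned above can be sketched in a few lines of NumPy. The idea: given a refusal direction r (typically extracted from activation differences between harmful and harmless prompts), project the rank-1 component along r out of each weight matrix that writes to the residual stream, so the layer can no longer emit that direction. This is a schematic illustration under those assumptions, not Abliterix's actual implementation:

```python
import numpy as np

def abliterate(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Schematic orthogonalized abliteration: remove the rank-1
    component of W along a refusal direction r, i.e. W' = W - r r^T W.
    Illustrative only -- not Abliterix's actual code."""
    r = r / np.linalg.norm(r)        # unit refusal direction
    return W - np.outer(r, r) @ W    # project r out of W's output space

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))      # stand-in for a weight matrix
r = rng.standard_normal(8)           # stand-in refusal direction
W_abl = abliterate(W, r)

# The edited matrix can no longer write anything along r:
print(np.allclose(r @ W_abl, 0))  # True
```

Because the edit is a projection rather than fine-tuning, the rest of the weight matrix is untouched, which is why the KL divergence from the original model can stay small.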

Usage

llama.cpp

```bash
# Q4_K_M (recommended for single GPU)
./llama-cli -m Qwen3.5-122B-A10B-abliterated-Q4_K_M.gguf -p "Your prompt here" -n 512

# Q8_0 (higher quality)
./llama-cli -m Qwen3.5-122B-A10B-abliterated-Q8_0.gguf -p "Your prompt here" -n 512
```

llama-server

```bash
./llama-server -m Qwen3.5-122B-A10B-abliterated-Q4_K_M.gguf --host 0.0.0.0 --port 8080
```
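Once running, llama-server exposes an OpenAI-compatible HTTP API. A minimal Python sketch that builds a chat-completion request for it, using only the standard library (the port matches the `--port` flag above; the prompt is illustrative):

```python
import json
import urllib.request

def chat_request(prompt: str, base_url: str = "http://localhost:8080"):
    """Build an OpenAI-compatible chat-completion request for llama-server."""
    url = f"{base_url}/v1/chat/completions"
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = chat_request("Your prompt here")
# Send with urllib.request.urlopen(req) once the server is up.
print(req.full_url)
```

Any OpenAI-compatible client library pointed at `http://localhost:8080/v1` should work the same way.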

Ollama

```bash
# Create a Modelfile
echo "FROM ./Qwen3.5-122B-A10B-abliterated-Q4_K_M.gguf" > Modelfile
ollama create qwen3.5-122b-abliterated -f Modelfile
ollama run qwen3.5-122b-abliterated
```
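A Modelfile can also set default sampling parameters alongside the GGUF path. A slightly fuller example (the parameter values here are illustrative defaults, not tuned recommendations for this model):

```
FROM ./Qwen3.5-122B-A10B-abliterated-Q4_K_M.gguf
PARAMETER temperature 0.7
PARAMETER num_ctx 8192
```

Parameters set in the Modelfile become the defaults for every `ollama run` session with that model.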

VRAM / RAM Requirements

| Quantization | Full GPU Offload | Partial Offload (32 layers) | CPU Only |
|--------------|------------------|-----------------------------|----------|
| Q4_K_M | ~74 GB (1x 80GB) | ~40 GB GPU + 40 GB RAM | ~80 GB RAM |
| Q8_0 | ~130 GB (2x 80GB) | ~70 GB GPU + 70 GB RAM | ~140 GB RAM |
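The figures above follow from a simple back-of-the-envelope estimate: file size is roughly parameter count times effective bits per weight, and full offload needs that plus headroom for the KV cache and compute buffers. A rough sketch (the bits-per-weight values and the 5% overhead factor are assumptions, not exact GGUF numbers):

```python
def est_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Rough GGUF memory footprint in GB: parameters (billions) times
    effective bits per weight, plus ~5% for metadata and non-quantized
    tensors (assumed overhead)."""
    return params_b * bits_per_weight / 8 * 1.05

# Assumed effective bits/weight: Q4_K_M ~4.5, Q8_0 ~8.0
print(f"Q4_K_M: ~{est_size_gb(122, 4.5):.0f} GB")
print(f"Q8_0:   ~{est_size_gb(122, 8.0):.0f} GB")
```

Actual requirements also scale with context length (KV cache), so budget extra RAM/VRAM for long contexts.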

Disclaimer

This model is provided for research purposes only. The creator is not responsible for any misuse.

Credits
