[Question] Best uncensoring approach for MoE models like Qwen3.5-35B-A3B?
#14
by Fenrenaf446 - opened
I'm exploring uncensored variants of the Qwen3.5-35B-A3B and noticed it uses a MoE (Mixture-of-Experts) architecture with sparse activation (~3B active params out of 35B total). Since traditional uncensoring techniques like abliteration are typically designed for dense models, I'm curious: what approach works best for MoE architectures? Does the sparse routing mechanism require specialized methods (e.g., router-aware fine-tuning) compared to standard dense models?
Only the o_proj, out_proj, and down_proj weights have been modified; you can diff those tensors against the base model to see exactly what changed.
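A minimal sketch of how such a comparison could work: load both checkpoints' state dicts and report which tensor names actually differ. The parameter names and state-dict layout below are illustrative assumptions (toy tensors, not the real Qwen checkpoint structure), but the same function applies to state dicts loaded via `transformers` or `safetensors`.

```python
# Sketch: find which tensors differ between a base and a modified checkpoint.
# The state dicts here are toy stand-ins; with real models you would load
# them, e.g., via AutoModelForCausalLM.from_pretrained(...).state_dict().
import torch

def modified_tensors(base_sd, tuned_sd, atol=1e-6):
    """Return names of tensors whose values differ between the two checkpoints."""
    changed = []
    for name, base_w in base_sd.items():
        tuned_w = tuned_sd.get(name)
        if tuned_w is None or not torch.allclose(base_w, tuned_w, atol=atol):
            changed.append(name)
    return changed

# Toy state dicts standing in for the two checkpoints (names are illustrative).
base = {
    "layers.0.self_attn.o_proj.weight": torch.ones(4, 4),
    "layers.0.mlp.experts.0.down_proj.weight": torch.ones(4, 4),
    "layers.0.mlp.experts.0.up_proj.weight": torch.ones(4, 4),
}
tuned = {k: v.clone() for k, v in base.items()}
tuned["layers.0.self_attn.o_proj.weight"] += 0.1        # simulated modification
tuned["layers.0.mlp.experts.0.down_proj.weight"] += 0.1  # simulated modification

print(modified_tensors(base, tuned))
```

In a real comparison you would expect only names containing `o_proj`, `out_proj`, or `down_proj` to appear in the output, while router and up-projection weights match the base model exactly.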