[Question] Best uncensoring approach for MoE models like Qwen3.5-35B-A3B?
#14
by Fenrenaf446 - opened
I'm exploring uncensored variants of the Qwen3.5-35B-A3B and noticed it uses a MoE (Mixture-of-Experts) architecture with sparse activation (~3B active params out of 35B total). Since traditional uncensoring techniques like abliteration are typically designed for dense models, I'm curious: what approach works best for MoE architectures? Does the sparse routing mechanism require specialized methods (e.g., router-aware fine-tuning) compared to standard dense models?
Only the o_proj, out_proj, and down_proj weights have been modified; you can diff those tensors against the base model to see exactly what changed.
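A minimal sketch of how such a comparison could work: load both checkpoints' state dicts and report which tensor names actually differ. The parameter names and state-dict layout below are illustrative assumptions (toy tensors, not the real Qwen checkpoint structure), but the same function applies to state dicts loaded via `transformers` or `safetensors`.

```python
# Sketch: find which tensors differ between a base and a modified checkpoint.
# The state dicts here are toy stand-ins; with real models you would load
# them, e.g., via AutoModelForCausalLM.from_pretrained(...).state_dict().
import torch

def modified_tensors(base_sd, tuned_sd, atol=1e-6):
    """Return names of tensors whose values differ between the two checkpoints."""
    changed = []
    for name, base_w in base_sd.items():
        tuned_w = tuned_sd.get(name)
        if tuned_w is None or not torch.allclose(base_w, tuned_w, atol=atol):
            changed.append(name)
    return changed

# Toy state dicts standing in for the two checkpoints (names are illustrative).
base = {
    "layers.0.self_attn.o_proj.weight": torch.ones(4, 4),
    "layers.0.mlp.experts.0.down_proj.weight": torch.ones(4, 4),
    "layers.0.mlp.experts.0.up_proj.weight": torch.ones(4, 4),
}
tuned = {k: v.clone() for k, v in base.items()}
tuned["layers.0.self_attn.o_proj.weight"] += 0.1        # simulated modification
tuned["layers.0.mlp.experts.0.down_proj.weight"] += 0.1  # simulated modification

print(modified_tensors(base, tuned))
```

In a real comparison you would expect only names containing `o_proj`, `out_proj`, or `down_proj` to appear in the output, while router and up-projection weights match the base model exactly.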