No output_router_logits / load_balancing_loss_func for Qwen3VLMoE?

#10
by plcedoz38 - opened

Qwen3VLMoE does not have the output_router_logits and router_aux_loss_coef options (in the model config or the forward pass) for returning router logits and controlling the weight of load_balancing_loss_func for the mixture-of-experts layers. Other MoE models in HF usually have them, e.g. Qwen2MoE (https://huggingface.co/docs/transformers/en/model_doc/qwen2_moe).

router_logits is listed in _can_record_outputs for Qwen3VLMoE, but I'm not sure how to use it. Is there a supported way to get the router logits out of the forward pass?
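For context, here is a rough, dependency-free sketch of the kind of auxiliary loss I would like to compute once the router logits are available. It follows the usual Switch-Transformer-style formulation (fraction of tokens routed to each expert times the mean router probability for that expert, scaled by the number of experts). The function name `load_balancing_loss` and the toy logits are my own, not the library's API, and I am not claiming it matches load_balancing_loss_func term for term.

```python
import math


def softmax(row):
    """Numerically stable softmax over one token's expert logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def load_balancing_loss(router_logits, top_k):
    """Hypothetical helper (not a transformers function).

    router_logits: list of per-token lists, each of length num_experts
                   (e.g. the logits of all MoE layers concatenated).
    top_k:         number of experts selected per token.
    """
    num_tokens = len(router_logits)
    num_experts = len(router_logits[0])
    probs = [softmax(row) for row in router_logits]

    # Count how often each expert is selected across all top-k slots.
    counts = [0] * num_experts
    for p in probs:
        chosen = sorted(range(num_experts), key=lambda e: p[e], reverse=True)[:top_k]
        for e in chosen:
            counts[e] += 1

    tokens_per_expert = [c / num_tokens for c in counts]
    mean_prob = [sum(p[e] for p in probs) / num_tokens for e in range(num_experts)]
    return num_experts * sum(f * q for f, q in zip(tokens_per_expert, mean_prob))


# Toy data: 4 tokens, 4 experts, top-1 routing.
balanced = [[10.0 if e == t else 0.0 for e in range(4)] for t in range(4)]
collapsed = [[10.0, 0.0, 0.0, 0.0] for _ in range(4)]

balanced_loss = load_balancing_loss(balanced, top_k=1)
collapsed_loss = load_balancing_loss(collapsed, top_k=1)
```

With evenly spread routing the value sits at its minimum, and it grows when all tokens collapse onto one expert. In training I would then add it to the language-modeling loss scaled by a coefficient, which is the role router_aux_loss_coef plays in the Qwen2MoE config.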

Thank you!!
