No output_router_logits / load_balancing_loss_func for Qwen3VLMoE?
#10, opened by plcedoz38
Qwen3VLMoE does not have the output_router_logits and router_aux_loss_coef options, either in the model config or in the forward pass, for returning the router logits and controlling the weight of load_balancing_loss_func for the mixture of experts. Other MoE models in HF usually provide these options, for example Qwen2MoE (https://huggingface.co/docs/transformers/en/model_doc/qwen2_moe).
router_logits is listed in _can_record_outputs for Qwen3VLMoE, but I am not sure how to use it.
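For context, the kind of auxiliary loss in question can be sketched as below. This is a minimal pure-Python illustration of a Switch-Transformer-style load-balancing loss, not the transformers implementation: the function name `load_balancing_loss_sketch` and the input layout (one list of logits per token, with all layers already concatenated) are assumptions made for the example, and how to actually obtain the router logits from Qwen3VLMoE is exactly the open question here.

```python
import math


def load_balancing_loss_sketch(router_logits, num_experts, top_k):
    """Hypothetical helper: Switch-style auxiliary load-balancing loss.

    router_logits: list of per-token logit lists (assumed layout, one
    entry per token, each of length num_experts).
    """
    num_tokens = len(router_logits)
    tokens_per_expert = [0.0] * num_experts  # fraction of tokens routed to each expert
    prob_per_expert = [0.0] * num_experts    # mean router probability per expert

    for logits in router_logits:
        # Numerically stable softmax over the experts for this token.
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        probs = [e / total for e in exps]

        # Indices of the top_k experts this token would be routed to.
        chosen = sorted(range(num_experts), key=lambda i: probs[i], reverse=True)[:top_k]
        for i in chosen:
            tokens_per_expert[i] += 1.0 / num_tokens
        for i, p in enumerate(probs):
            prob_per_expert[i] += p / num_tokens

    # The loss grows when routed tokens and router probability concentrate
    # on the same few experts.
    return num_experts * sum(f * p for f, p in zip(tokens_per_expert, prob_per_expert))


# Toy inputs: 8 tokens, 4 experts, top-2 routing.
balanced = [[0.0, 0.0, 0.0, 0.0] for _ in range(8)]
collapsed = [[10.0, 10.0, -10.0, -10.0] for _ in range(8)]
print(load_balancing_loss_sketch(balanced, num_experts=4, top_k=2))
print(load_balancing_loss_sketch(collapsed, num_experts=4, top_k=2))
```

In training, a value like this would be multiplied by a coefficient such as router_aux_loss_coef and added to the language-modeling loss, which is what the missing config options would control.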
Thank you!!