No output_router_logits / load_balancing_loss_func for Qwen3VLMoE?
Qwen3VLMoE does not expose the output_router_logits and router_aux_loss_coef options (in the model config or in the forward pass) that return the router logits and set the weight of load_balancing_loss_func for the mixture of experts. Other MoE models in HF usually provide both, for example Qwen2MoE (https://huggingface.co/docs/transformers/en/model_doc/qwen2_moe).
router_logits is listed in _can_record_outputs for Qwen3VLMoE, but I'm not sure how to use it.
Thank you!!
I will try to add them to transformers in the next few days and will update here once it's done.
@plcedoz38
@hamza-hcompany
Hi, it's been added in this PR: https://github.com/huggingface/transformers/pull/41277. You can now get router_logits by passing the extra argument output_router_logits to the model's forward!
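For anyone wondering what the router logits are then used for, here is a rough sketch of a Switch-Transformer-style load-balancing auxiliary loss in plain Python. It is only an illustration of the idea and not the transformers implementation of load_balancing_loss_func; the function name load_balancing_loss, the list-of-lists input format and the aux_coef value are assumptions made up for this example.

```python
import math


def load_balancing_loss(router_logits, top_k):
    """Illustrative auxiliary loss (hypothetical helper, not the HF function).

    router_logits: list of per-token lists, one logit per expert.
    Returns num_experts * sum_i(f_i * P_i), where f_i is the fraction of
    top-k assignments sent to expert i and P_i is the mean router
    probability of expert i.
    """
    num_tokens = len(router_logits)
    num_experts = len(router_logits[0])
    tokens_per_expert = [0.0] * num_experts
    prob_per_expert = [0.0] * num_experts

    for logits in router_logits:
        # Numerically stable softmax over the experts for this token.
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        probs = [e / total for e in exps]

        # Indices of the top_k experts selected for this token.
        chosen = sorted(range(num_experts), key=lambda i: probs[i], reverse=True)[:top_k]
        for i in chosen:
            tokens_per_expert[i] += 1.0 / num_tokens
        for i, p in enumerate(probs):
            prob_per_expert[i] += p / num_tokens

    return num_experts * sum(f * p for f, p in zip(tokens_per_expert, prob_per_expert))


# Four tokens, four experts: each token prefers a different expert.
balanced = [[10.0 if e == t else 0.0 for e in range(4)] for t in range(4)]
# Four tokens that all prefer expert 0.
collapsed = [[10.0, 0.0, 0.0, 0.0] for _ in range(4)]

balanced_loss = load_balancing_loss(balanced, top_k=1)
collapsed_loss = load_balancing_loss(collapsed, top_k=1)

# Assumed weighting, playing the role of router_aux_loss_coef.
aux_coef = 0.001
weighted_aux = aux_coef * collapsed_loss
```

The loss is lowest when tokens are spread evenly across experts and grows when routing collapses onto a few of them. In training, the weighted term would be added to the task loss, which is the role router_aux_loss_coef plays in the other HF MoE models.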
Thank you very much!