ds8b_grpo_math_gsm8k_rloo-global_step_500 / model-00007-of-00007.safetensors

Commit History

Upload ds8b_grpo_math_gsm8k_rloo at global_step_500
2e62dc0
verified

polaris-73 committed on