asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-eagle3-spec2 Reinforcement Learning • Updated 19 days ago
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-eagle3-spec4 Reinforcement Learning • Updated 19 days ago
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-ngram-spec4 Reinforcement Learning • Updated 19 days ago
asatheesh/deepmath-qwen3-4b-instruct-rloo-lora-eagle3-spec5 Reinforcement Learning • Updated 19 days ago