rlsamplingJF/posttraining_sentence_Qwen2.5-7B-Instruct-finemath-rm-run1-lr1e-6-constant-bs8-gc10.0-step84 7B • Updated 12 days ago • 10
rlsamplingJF/posttraining_sentence_Qwen2.5-7B-Instruct-finemath-rm-run1-lr1e-6-constant-bs8-gc10.0-step36 7B • Updated 12 days ago • 108
rlsamplingJF/posttraining_sentence_Qwen2.5-7B-Instruct-finemath-rm-run1-lr1e-6-constant-bs8-gc10.0-initial 7B • Updated 29 days ago • 70
rlsamplingJF/posttraining_sentence_Qwen2.5-7B-Instruct-finemath-rm-run1-lr1e-6-constant-bs8-gc10.0-step96 7B • Updated Sep 24 • 20
rlsamplingJF/cpt_rm_training_8BT_filtered_Qwen2.5-3B-finemath-rm-run4-lr1e-4-cosine-bs32-gc1.0-step15 3B • Updated Sep 22 • 2
rlsamplingJF/cpt_rm_training_8BT_filtered_llama-3.2-3b-finemath-rm-run5-lr3e-5-cosine-bs32-gc1.0-step30 3B • Updated Sep 22 • 2