---
license: apache-2.0
---

Checkpoint from training step 500, trained on the [easy prompt set](https://huggingface.co/datasets/RLHFlow/reinforce_ada_easy_prompt).
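
To inspect the prompts this checkpoint was trained on, the linked dataset can be pulled with the `datasets` library. The following is a minimal sketch, not an official loading recipe: the dataset ID is taken from the link above, while the helper name `load_easy_prompts` is hypothetical and the split and column names are not stated in this card, so the code does not assume any.

```python
# Dataset ID copied from the link in this card.
DATASET_ID = "RLHFlow/reinforce_ada_easy_prompt"


def load_easy_prompts():
    """Download the easy prompt set from the Hugging Face Hub.

    Hypothetical helper for illustration. Requires the `datasets` package
    and network access. No split is requested, so the return value is
    typically a DatasetDict keyed by whatever splits the repository defines.
    """
    # Imported lazily so this module can be loaded without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID)
```

Calling `load_easy_prompts()` and printing the result lists the available splits and columns, which is a reasonable first step before relying on any particular field name.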