This is the official checkpoint of the feedback model trained using COFFEE-GYM with the PPO strategy.

Given a piece of erroneous code, this model generates natural language feedback on it.
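As a rough illustration of how such a feedback model might be queried, here is a minimal sketch using the Hugging Face `transformers` library. Several things in it are assumptions rather than documented behavior: that the checkpoint loads through `AutoModelForCausalLM`, that the repository ID is the one shown on the model page, and above all the prompt layout produced by `build_prompt`, which is a hypothetical format and not the official template. The helper names `build_prompt` and `generate_feedback` are likewise made up for this example. Please consult the paper for the prompt format actually used in training.

```python
# Repository ID as displayed on the model page (assumed to be loadable as-is).
MODEL_ID = "Team-Coffee-Gym/DS-Coder-7B-PPO-CoffeeEval"


def build_prompt(problem: str, wrong_code: str) -> str:
    """Assemble a plain-text prompt.

    NOTE: this layout is a hypothetical placeholder, not the official template.
    """
    return (
        "Problem description:\n"
        f"{problem.strip()}\n\n"
        "Incorrect code:\n"
        f"{wrong_code.strip()}\n\n"
        "Feedback:"
    )


def generate_feedback(problem: str, wrong_code: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and return the generated feedback text."""
    # Imported lazily so build_prompt can be used without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the checkpoint is a standard causal language model.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(build_prompt(problem, wrong_code), return_tensors="pt")
    with torch.no_grad():
        # Greedy decoding keeps the output deterministic for a given prompt.
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `generate_feedback("Return the sum of two integers.", "def add(a, b):\n    return a - b")` would download the weights on first use and return the model's feedback as a string. Note that a 7B-parameter model in full precision needs a substantial amount of memory.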

For further details, please see our paper.

https://huggingface.co/spaces/Coffee-Gym/Project-Coffee-Gym

Model size: 7B params. Tensor type: F32. Weights format: Safetensors.