Chaew00n/test-policy-optimization-query-rewrite-llama3B
Transformers
Safetensors
Generated from Trainer
trl
grpo
arxiv:2402.03300
main
test-policy-optimization-query-rewrite-llama3B/adapter_model.safetensors
Commit History
Training in progress, step 5000
272f136
verified
Chaew00n committed on Jun 9
Training in progress, step 4000
89fcb90
verified
Chaew00n committed on Jun 9
Training in progress, step 3000
76658a0
verified
Chaew00n committed on Jun 9
Training in progress, step 2000
0cd3736
verified
Chaew00n committed on Jun 9
Training in progress, step 1000
92a3e2d
verified
Chaew00n committed on Jun 9