stanfordnlp/SteamSHP-flan-t5-xl


kawine committed on Oct 10, 2023

Commit dfae895 · 1 Parent(s): 944b054

Update README.md

Files changed (1)

README.md +2 -0


@@ -21,6 +21,8 @@ tags:
 <!-- Provide a quick summary of what the model is/does. -->
+**If you mention this dataset in a paper, please cite the paper:** [Understanding Dataset Difficulty with V-Usable Information (ICML 2022)](https://proceedings.mlr.press/v162/ethayarajh22a.html).
 SteamSHP-XL is a preference model trained to predict -- given some context and two possible responses -- which response humans will find more helpful.
 It can be used for NLG evaluation or as a reward model for RLHF.