[
  {
    "instruction": "human instruction (required)",
    "input": "human input (optional)",
    "chosen": "high-quality answer (required)",
    "rejected": "low-quality answer (required)"
  }
]
Can chosen and rejected be given as lists here? How should I construct my data?
Only a single value is supported for each field. If you have several, split them into multiple records.
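For anyone landing here with list-valued data, a minimal sketch of that split might look like the following. The helper name split_record is hypothetical, and pairing every chosen answer with every rejected answer is my own assumption; the reply above only says to split into multiple records, not how to pair them.

```python
from itertools import product


def split_record(record):
    """Expand a record whose chosen/rejected may be lists into one record per pair.

    Assumption: every chosen answer is paired with every rejected answer.
    """
    def as_list(value):
        # Accept either a single string or a list of strings.
        return value if isinstance(value, list) else [value]

    # Keep all other fields (instruction, input, ...) unchanged.
    base = {k: v for k, v in record.items() if k not in ("chosen", "rejected")}
    return [
        {**base, "chosen": c, "rejected": r}
        for c, r in product(as_list(record["chosen"]), as_list(record["rejected"]))
    ]


raw = {
    "instruction": "question",
    "input": "",
    "chosen": ["good answer 1", "good answer 2"],
    "rejected": ["bad answer"],
}
records = split_record(raw)
```

Each element of records then has a single string in chosen and a single string in rejected, which matches the format shown at the top of the issue.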
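@hiyouga One more question. I have several responses per prompt, each with a quality score. Suppose that, ranked by score, the responses are a > b > c > d. Which of the following approaches is recommended?

Option 1: use the scores relatively. For example, b is worse than a, and b is better than c.
[ { "instruction": prompt, "chosen": a, "rejected": b }, { "instruction": prompt, "chosen": b, "rejected": c },...]

Option 2: set a threshold, for example a > b > thresh > c > d, so that a and b are chosen and c and d are rejected.
[ { "instruction": prompt, "chosen": a, "rejected": c }, { "instruction": prompt, "chosen": a, "rejected": d },... ]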
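To make the two options concrete, here is a rough sketch of how each could be generated. The function names adjacent_pairs and threshold_pairs are hypothetical, and treating a score equal to the threshold as rejected is my own assumption. This only illustrates the two constructions and does not say which one is better.

```python
def adjacent_pairs(prompt, ranked):
    """Option 1: `ranked` is ordered best to worst; pair each response with the next one."""
    return [
        {"instruction": prompt, "chosen": better, "rejected": worse}
        for better, worse in zip(ranked, ranked[1:])
    ]


def threshold_pairs(prompt, scored, thresh):
    """Option 2: `scored` is a list of (response, score) tuples.

    Responses scoring above `thresh` are chosen, the rest are rejected
    (assumption: a score equal to `thresh` counts as rejected).
    """
    chosen = [resp for resp, score in scored if score > thresh]
    rejected = [resp for resp, score in scored if score <= thresh]
    return [
        {"instruction": prompt, "chosen": c, "rejected": r}
        for c in chosen
        for r in rejected
    ]


option_1 = adjacent_pairs("prompt", ["a", "b", "c", "d"])
option_2 = threshold_pairs(
    "prompt", [("a", 4), ("b", 3), ("c", 2), ("d", 1)], thresh=2.5
)
```

With four ranked responses, option 1 gives three records (a/b, b/c, c/d) and option 2 with the threshold between b and c gives four (a/c, a/d, b/c, b/d).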