GPT4 prompt when evaluating DPO #88

Open
kygguo opened this issue Sep 5, 2024 · 0 comments

kygguo commented Sep 5, 2024

Thanks for sharing the amazing repo!

The GPT-4 win rate prompt given in the paper is attached below. Since the HH dataset covers both helpfulness and harmlessness, I wonder why only helpfulness is considered when evaluating models. Is there any special consideration behind this?

Dialogue GPT-4 win rate prompt.
For the following query to a chatbot, which response is more helpful?
Query: <the user query>
Response A:
<either the test method or baseline>
Response B:
<the other response>
FIRST provide a one-sentence comparison of the two responses and explain \
which you feel is more helpful. SECOND, on a new line, state only "A" or \
"B" to indicate which response is more helpful. Your response should use \
the format:
Comparison: <one-sentence comparison and explanation>
More helpful: <"A" or "B">
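
For context, here is a rough sketch of how I understand a win rate would be computed from this prompt. It is my own illustration, not code from the repo or the paper. The `judge` callable is a hypothetical stand-in for whatever sends the prompt to GPT-4 and returns its text, and the random A/B ordering is my assumption based on the "either the test method or baseline" placeholder.

```python
import random
import re
from typing import Callable, Iterable, Optional, Tuple

# Prompt text copied from the issue; only the three placeholders are filled in.
PROMPT_TEMPLATE = (
    "For the following query to a chatbot, which response is more helpful?\n"
    "Query: {query}\n"
    "Response A:\n{a}\n"
    "Response B:\n{b}\n"
    "FIRST provide a one-sentence comparison of the two responses and explain "
    'which you feel is more helpful. SECOND, on a new line, state only "A" or '
    '"B" to indicate which response is more helpful. Your response should use '
    "the format:\n"
    "Comparison: <one-sentence comparison and explanation>\n"
    'More helpful: <"A" or "B">'
)

# Accepts both `More helpful: A` and `More helpful: "A"`.
VERDICT_RE = re.compile(r'More helpful:\s*"?([AB])"?\s*$', re.MULTILINE)


def build_prompt(query: str, test: str, baseline: str,
                 rng: random.Random) -> Tuple[str, bool]:
    """Return the filled prompt and whether the test response was placed in slot A."""
    test_is_a = rng.random() < 0.5  # assumption: order is randomized per example
    a, b = (test, baseline) if test_is_a else (baseline, test)
    return PROMPT_TEMPLATE.format(query=query, a=a, b=b), test_is_a


def parse_verdict(judge_output: str) -> Optional[str]:
    """Extract "A" or "B" from the judge's reply, or None if it is malformed."""
    match = VERDICT_RE.search(judge_output)
    return match.group(1) if match else None


def win_rate(examples: Iterable[Tuple[str, str, str]],
             judge: Callable[[str], str], seed: int = 0) -> float:
    """Fraction of parseable judgments in which the test response was preferred.

    `examples` yields (query, test_response, baseline_response) triples.
    `judge` is a hypothetical callable wrapping the GPT-4 request.
    """
    rng = random.Random(seed)
    wins = total = 0
    for query, test, baseline in examples:
        prompt, test_is_a = build_prompt(query, test, baseline, rng)
        verdict = parse_verdict(judge(prompt))
        if verdict is None:
            continue  # skip replies that do not follow the requested format
        total += 1
        wins += (verdict == "A") == test_is_a
    return wins / total if total else float("nan")
```

As written, this only measures helpfulness, because the prompt asks nothing else. Scoring harmlessness would presumably need a different or additional prompt, which is what I am asking about.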