
Request: Verifier in the generation loop #416

Open
RyanMarten opened this issue Jan 28, 2025 · 4 comments
Labels
enhancement New feature or request

Comments

@RyanMarten
Contributor

RyanMarten commented Jan 28, 2025

This is something that was asked for previously with Judge (and also structured output adherence).

I feel like this is the perfect time to add it (for rejection sampling with a verifier).

Why is this helpful?
Without it, you have to wait until the end of the run to verify, filter, and start a new job.
With it, you know immediately when a request comes back (plus verification time) whether it passed, and can resend that request concurrently.
This means a speed-up for repeated tries: you don't have to wait for the straggler calls. (Note on that: stragglers are due to the number of tokens generated, which is non-deterministic and changes when you do repeated sampling.)

Rejected samples can be written to a separate dataset.arrow file (so we don't just throw them away, as we currently do with failing structured output).

Viewing these rejected samples in the curator-viewer would be very useful.

We can also report rejection rates on the CLI and in the viewer.
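The loop described above can be sketched roughly as follows. This is a minimal illustration, not curator's actual API: `generate`, `verify`, and `sample_with_verifier` are hypothetical stand-ins. The point is that each prompt retries independently the moment its own verification fails, rather than waiting for the whole batch, and rejected samples are kept for a separate dataset plus a rejection-rate report.

```python
import asyncio
import random


async def generate(prompt: str, rng: random.Random) -> str:
    # Hypothetical stand-in for a model call; returns a random "answer".
    await asyncio.sleep(0)
    return str(rng.randint(0, 3))


def verify(response: str) -> bool:
    # Hypothetical verifier: accept only the correct answer.
    return response == "2"


async def sample_with_verifier(prompt: str, rng: random.Random, max_tries: int = 5):
    """Retry this one prompt immediately on rejection, keeping rejects."""
    rejected = []
    for _ in range(max_tries):
        response = await generate(prompt, rng)
        if verify(response):
            return response, rejected
        rejected.append(response)
    return None, rejected


async def run(prompts, seed: int = 0):
    rng = random.Random(seed)
    # Each prompt's retry loop runs concurrently; no prompt waits for
    # another prompt's stragglers before resampling.
    results = await asyncio.gather(*(sample_with_verifier(p, rng) for p in prompts))
    accepted = [r for r, _ in results if r is not None]
    rejected = [x for _, rej in results for x in rej]
    total = len(accepted) + len(rejected)
    rate = len(rejected) / total if total else 0.0
    return accepted, rejected, rate
```

The rejected list is what would land in the separate dataset.arrow file, and `rate` is the rejection rate that could be surfaced on the CLI and in the viewer.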

@RyanMarten
Contributor Author

[Image: illustration of the verifier in the generation loop]

@RyanMarten
Contributor Author

@shreyaspimpalgaonkar says that a verifier as an abstraction doesn't really make sense, since it is very different for math and for code (code verification is a heavy lift).

It is much more intensive and has much more overhead than the simple parsing check we do right now.

This is also related to what @kartik4949 was suggesting: letting individual requests pass through curator calls without blocking until all of them finish.
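The non-blocking behavior suggested here can be sketched with `asyncio.as_completed`, which yields each request as soon as it finishes instead of waiting for the whole batch. The names below (`fake_request`, `stream_results`) are hypothetical, and the sleep delays stand in for non-deterministic generation latency.

```python
import asyncio


async def fake_request(i: int, delay: float) -> int:
    # Hypothetical stand-in for an LLM call with variable latency.
    await asyncio.sleep(delay)
    return i


async def stream_results(delays):
    tasks = [asyncio.create_task(fake_request(i, d)) for i, d in enumerate(delays)]
    finished = []
    # as_completed yields each request the moment it completes, so a
    # verifier (or a retry) can run on it without waiting for stragglers.
    for fut in asyncio.as_completed(tasks):
        finished.append(await fut)
    return finished
```

With delays of 30 ms, 0 ms, and 10 ms, the results come back in completion order (request 1, then 2, then 0) rather than submission order, which is what lets a verify-and-resample step start early.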

@RyanMarten
Contributor Author

@vutrung96 says that an async architecture here will add complexity, and since this is an optimization rather than a new feature, we should de-prioritize it.

@RyanMarten RyanMarten added the enhancement New feature or request label Jan 28, 2025
@RyanMarten
Contributor Author

Agreed, we should just measure how bad this is. Let's do the dumb thing first. The only reason I'm mentioning this optimization now is that with reasoning models, response generation time is long and exacerbates this issue.
