Refactor FSM idx tracking during batch generation #449
    def reset(self) -> None:
        """Reset the FSM to its initial state. Here this only resets the token counter."""
        self.num_tokens_generated = 0

    def copy(self) -> "StopAtTokenFSM":
Is the `copy` mechanism only necessary because we enforce `max_tokens` here? Same question for the other FSM. #451 is removing this mechanism from the FSMs.
Yes, correct! If `max_tokens` is handled elsewhere, as suggested in #451, then the `copy` method for both `StopAtTokenFSM` and `RegexFSM` can just be `return self`, which removes all overhead. This is a great suggestion.
Great! This reassures me that `max_tokens` should not be handled at the FSM level :)
dd47100 to f206c5e
I rebased on main and added a small commit to format docstrings, don't forget to pull locally :)
I think it makes sense to wait until #451 is merged so we can simplify the
Does this mean we don't need to pass the seq idx as the first argument in #481?
We pass the
Should be up to date now post #451 merge. I changed the
    num_sequences = len(prompts)
    fsms = [self.fsm.copy() for _ in prompts]
Do we still need an array of FSMs now that they’re stateless?
Yes, because of `CFGFSM` (which needs to track incremental parser state for each sequence). That still needs to deepcopy new objects, but the others (`RegexFSM` and `StopAtFSM`) are stateless and can just return pointers to the original object.
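The distinction drawn above can be illustrated with a minimal sketch. The classes `StatelessFSM` and `StatefulFSM` below are hypothetical stand-ins, not the project's actual `RegexFSM` or `CFGFSM`; the only assumption taken from the discussion is that a stateless FSM may return itself from `copy` while a stateful one must deep-copy.

```python
import copy


class StatelessFSM:
    """Hypothetical stand-in for an FSM with no per-sequence state."""

    def __init__(self, transitions):
        # Shared, read-only transition table: (state, token_id) -> next state.
        self.transitions = transitions

    def next_state(self, state, token_id):
        return self.transitions.get((state, token_id), -1)

    def copy(self):
        # Nothing is mutated during generation, so sharing one object is safe.
        return self


class StatefulFSM:
    """Hypothetical stand-in for an FSM that tracks incremental parser state."""

    def __init__(self):
        self.consumed = []  # placeholder for per-sequence parser state

    def next_state(self, state, token_id):
        self.consumed.append(token_id)
        return state + 1

    def copy(self):
        # Each sequence needs its own independent state.
        return copy.deepcopy(self)


prompts = ["a", "b", "c"]
stateless = StatelessFSM({(0, 1): 1})
stateful = StatefulFSM()

stateless_fsms = [stateless.copy() for _ in prompts]
stateful_fsms = [stateful.copy() for _ in prompts]

# Advancing one sequence must not affect the others or the template.
stateful_fsms[0].next_state(0, 42)
```

With this shape the list of FSMs costs nothing for the stateless case (every entry is the same object), while the stateful case pays for one deep copy per sequence.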
Great! I’ll review shortly. Could you rebase the branch on
…ng has been generated
…erator instead of when initializing it
a887385 to 0c79c3e
Sorry about that. I just rebased, but I had made a bit of a mess earlier when I pulled main into this branch, since a lot had happened in between. If this is still too messy, I can open a new PR on a new branch and just apply the diff with main to make the history cleaner. Let me know your preference.
Yes, I am sorry for the extra work but this would be preferable. In the future, use
Moved and updated to #510
As discussed in #433, here is a proposal for removing the `idx` argument from the `FSM` interface, as well as the `reset` method, at the cost of adding a `copy` method. An FSM is initialized, then copied for each prompt during a `SequenceGenerator.__call__()` on a new batch. This solves the `max_tokens` issue, and cleans up state tracking for `CFGFSM`. The cost is overhead associated with building a new `FSM` object for each sequence in a batch. There are ways to reduce this cost by improving the new `copy` method, but wanted to get thoughts on the overall interface first.
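As a rough illustration of the proposed flow (initialize one FSM, copy it per prompt, no `idx` argument and no `reset`), here is a minimal sketch. `CountingFSM` and `run_batch` are hypothetical names invented for this example, and the token streams stand in for model sampling; none of this is the library's actual code.

```python
from typing import List


class CountingFSM:
    """Hypothetical FSM that ends on a stop token or after max_tokens tokens."""

    FINAL = -1

    def __init__(self, stop_token_id: int, max_tokens: int):
        self.stop_token_id = stop_token_id
        self.max_tokens = max_tokens
        self.num_tokens_generated = 0  # per-instance counter, no idx needed

    def next_state(self, state: int, token_id: int) -> int:
        self.num_tokens_generated += 1
        if (
            token_id == self.stop_token_id
            or self.num_tokens_generated >= self.max_tokens
        ):
            return self.FINAL
        return state

    def is_final_state(self, state: int) -> bool:
        return state == self.FINAL

    def copy(self) -> "CountingFSM":
        # A fresh object starts with a zeroed counter, which replaces reset().
        return CountingFSM(self.stop_token_id, self.max_tokens)


def run_batch(fsm: CountingFSM, token_streams: List[List[int]]) -> List[int]:
    """Consume pre-sampled token streams; return tokens accepted per sequence."""
    fsms = [fsm.copy() for _ in token_streams]  # one FSM per sequence
    lengths = []
    for seq_fsm, tokens in zip(fsms, token_streams):
        state, accepted = 0, 0
        for token_id in tokens:
            state = seq_fsm.next_state(state, token_id)
            accepted += 1
            if seq_fsm.is_final_state(state):
                break
        lengths.append(accepted)
    return lengths


template = CountingFSM(stop_token_id=0, max_tokens=3)
lengths = run_batch(template, [[5, 0, 7], [5, 6, 7, 8], [5]])
```

Because every sequence owns its copy, the token counter never leaks between sequences or between batches, and the template object is left untouched.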