Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor FSM idx tracking during batch generation #449

Closed
wants to merge 27 commits into from

Conversation

benlipkin
Copy link
Contributor

@benlipkin benlipkin commented Dec 19, 2023

As discussed in #433, here is a proposal for removing the idx argument from the FSM interface, as well as the reset method, at the cost of adding a copy method. An FSM is initialized, then copied for each prompt during a SequenceGenerator.__call__() on a new batch. This solves the max_tokens issue, and cleans up state tracking for CFGFSM. The cost is overhead associated with building a new FSM object for each sequence in a batch. There are ways to reduce this cost by improving the new copy method, but wanted to get thoughts on the overall interface first.

def reset(self) -> None:
"""Reset the FSM to its initial state. Here this only resets the token counter."""
self.num_tokens_generated = 0
def copy(self) -> "StopAtTokenFSM":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the copy mechanism only necessary because we enforce max_token here? Same question for the other FSM. #451 is removing this mechanism from the FSMs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, correct! If max_tokens is handled elsewhere, as suggested in #451, then the copy method for both StopAtTokenFSM and RegexFSM can just be return self, which removes all overhead. This is a great suggestion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! This comforts me in the idea that max_tokens should not be handled at the FSM level :)

@rlouf rlouf force-pushed the refactor-fsm-idx-tracking branch from dd47100 to f206c5e Compare December 20, 2023 09:54
@rlouf
Copy link
Member

rlouf commented Dec 20, 2023

I rebased on main and added a small commit to format docstrings, don't forget to pull locally :)

@rlouf
Copy link
Member

rlouf commented Dec 22, 2023

I think it makes sense to wait until #451 is merged so we can simplify the copy method as well.

@tscholak
Copy link
Contributor

Does this mean we don't need to pass the seq idx as the first argument in #481?

@rlouf
Copy link
Member

rlouf commented Dec 26, 2023

We pass the seq_idx to the vLLM logits processor to keep track of which FSM state corresponds to which sequence.

@benlipkin
Copy link
Contributor Author

Should be up to date now post #451 merge. I changed the copy method to return self as discussed for the (now) stateless FSMs.

num_sequences = len(prompts)
fsms = [self.fsm.copy() for _ in prompts]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need an array of FSMs now that they’re stateless?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because of CFGFSM (which needs to track incremental parser state for each). That still needs to deepcopy new objects, but the others (RegexFSM and StopAtFSM) are stateless and can just return pointers to original object.

@rlouf
Copy link
Member

rlouf commented Jan 6, 2024

Great! I’ll review shortly. Could you rebase the branch on main? I see a lot of commits that are not related to this PR

@benlipkin benlipkin force-pushed the refactor-fsm-idx-tracking branch from a887385 to 0c79c3e Compare January 6, 2024 22:25
@benlipkin
Copy link
Contributor Author

Great! I’ll review shortly. Could you rebase the branch on main? I see a lot of commits that are not related to this PR

Sorry about that. Just rebased now, but had made a bit of a mess before when I pulled main into this branch since a lot had happened in between. If this is still too messy, I can open a new PR on a new branch and just apply the diff with main to make the history cleaner. Let me know preferences.

@rlouf
Copy link
Member

rlouf commented Jan 7, 2024

If this is still too messy, I can open a new PR on a new branch and just apply the diff with main to make the history cleaner. Let me know preferences.

Yes, I am sorry for the extra work but this would be preferable. In the future, use rebase instead of merge when you are working on a branch!

@benlipkin
Copy link
Contributor Author

Moved and updated to #510

@benlipkin benlipkin closed this Jan 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants