Refactor FSM idx tracking during batch generation #449

benlipkin · 2023-12-19T19:59:25Z

As discussed in #433, here is a proposal for removing the idx argument from the FSM interface, as well as the reset method, at the cost of adding a copy method. An FSM is initialized once, then copied for each prompt when SequenceGenerator.__call__() is invoked on a new batch. This solves the max_tokens issue and cleans up state tracking for CFGFSM. The cost is the overhead of building a new FSM object for each sequence in a batch. There are ways to reduce this cost by improving the new copy method, but I wanted to get thoughts on the overall interface first.
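To make the proposed interface concrete, here is a minimal sketch of the copy-per-prompt pattern. ToyFSM, its token ids, and init_batch are hypothetical names invented for illustration; they are not the classes in outlines/fsm/fsm.py.

```python
from typing import List


class ToyFSM:
    """Hypothetical FSM: allows tokens 0-2, then only EOS (token 0) after `limit` tokens."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.num_tokens_generated = 0  # mutable state owned by one sequence

    def allowed_token_ids(self, state: int) -> List[int]:
        # No `idx` argument: this object serves exactly one sequence.
        if self.num_tokens_generated >= self.limit:
            return [0]
        return [0, 1, 2]

    def next_state(self, state: int, token_id: int) -> int:
        self.num_tokens_generated += 1
        return state + 1

    def copy(self) -> "ToyFSM":
        # Building a fresh object takes the place of the old `reset` method.
        return ToyFSM(self.limit)


def init_batch(fsm: ToyFSM, prompts: List[str]) -> List[ToyFSM]:
    # One FSM per prompt, created at the start of every call on a new batch.
    return [fsm.copy() for _ in prompts]
```

Because each prompt gets its own object, advancing one sequence cannot change the token count seen by another, and a second call on a new batch starts from fresh copies.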

rlouf · 2023-12-20T09:21:19Z

outlines/fsm/fsm.py

-    def reset(self) -> None:
-        """Reset the FSM to its initial state. Here this only resets the token counter."""
-        self.num_tokens_generated = 0
+    def copy(self) -> "StopAtTokenFSM":


Is the copy mechanism only necessary because we enforce max_tokens here? Same question for the other FSM. #451 is removing this mechanism from the FSMs.

Yes, correct! If max_tokens is handled elsewhere, as suggested in #451, then the copy method for both StopAtTokenFSM and RegexFSM can simply return self, which removes all the overhead. This is a great suggestion.

Great! This reinforces my view that max_tokens should not be handled at the FSM level :)
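The exchange above can be sketched as follows: once the token budget lives in the generation loop, the FSM holds no per-sequence data and copy can return self. StatelessFSM, generate, and the pick callback are hypothetical names used only for this illustration, not the code from #451.

```python
from typing import Callable, List


class StatelessFSM:
    """Hypothetical FSM whose answers depend only on the `state` argument."""

    def allowed_token_ids(self, state: int) -> List[int]:
        return [0] if state == -1 else [0, 1]

    def next_state(self, state: int, token_id: int) -> int:
        # Token 0 plays the role of EOS; -1 marks the final state.
        return -1 if token_id == 0 else state + 1

    def copy(self) -> "StatelessFSM":
        return self  # nothing to isolate, so no new object is built


def generate(fsm: StatelessFSM, pick: Callable[[List[int]], int], max_tokens: int) -> List[int]:
    # max_tokens is enforced here, in the loop, rather than inside the FSM.
    state, tokens = 0, []
    while len(tokens) < max_tokens and state != -1:
        token = pick(fsm.allowed_token_ids(state))
        tokens.append(token)
        state = fsm.next_state(state, token)
    return tokens
```

With this split, the FSM only answers "which tokens are allowed in this state", and the caller decides when to stop.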

rlouf · 2023-12-20T09:54:59Z

I rebased on main and added a small commit to format docstrings. Don't forget to pull locally :)

rlouf · 2023-12-22T20:46:25Z

I think it makes sense to wait until #451 is merged so we can simplify the copy method as well.

tscholak · 2023-12-25T16:15:58Z

Does this mean we don't need to pass the seq idx as the first argument in #481?

rlouf · 2023-12-26T11:18:16Z

We pass the seq_idx to the vLLM logits processor to keep track of which FSM state corresponds to which sequence.
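As a rough illustration of why the sequence index is useful there, the sketch below keeps one FSM state per seq_idx in a dictionary. The class names and the call signature are assumptions made for this example; they are not vLLM's actual logits processor API, and scores are plain Python lists rather than tensors.

```python
import math
from collections import defaultdict
from typing import DefaultDict, List


class ToyRegexFSM:
    """Hypothetical stateless FSM that accepts tokens 0, 1, 0, 1, ..."""

    def allowed_token_ids(self, state: int) -> List[int]:
        return [state % 2]

    def next_state(self, state: int, token_id: int) -> int:
        return state + 1


class SeqIdxLogitsProcessor:
    """Hypothetical processor; the signature is an assumption, not vLLM's API."""

    def __init__(self, fsm: ToyRegexFSM) -> None:
        self.fsm = fsm
        # Maps seq_idx -> current FSM state; unseen sequences start at state 0.
        self.fsm_state: DefaultDict[int, int] = defaultdict(int)

    def __call__(self, seq_idx: int, input_ids: List[int], scores: List[float]) -> List[float]:
        if input_ids:
            # Advance this sequence's state using its most recent token.
            self.fsm_state[seq_idx] = self.fsm.next_state(
                self.fsm_state[seq_idx], input_ids[-1]
            )
        allowed = set(self.fsm.allowed_token_ids(self.fsm_state[seq_idx]))
        # Mask every token the FSM does not allow in the current state.
        return [s if i in allowed else -math.inf for i, s in enumerate(scores)]
```

Here the FSM itself stays shared and stateless, while the processor owns the mapping from sequence to state.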

benlipkin · 2024-01-05T22:25:05Z

This should be up to date now that #451 is merged. I changed the copy method to return self, as discussed, for the (now) stateless FSMs.

rlouf · 2024-01-06T21:23:48Z

outlines/generate/api.py

        num_sequences = len(prompts)
+        fsms = [self.fsm.copy() for _ in prompts]


Do we still need an array of FSMs now that they’re stateless?

Yes, because of CFGFSM, which needs to track incremental parser state for each sequence. It still needs to deepcopy new objects, but the others (RegexFSM and StopAtTokenFSM) are stateless and can just return a reference to the original object.
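A small sketch of the distinction drawn in this reply, assuming a stand-in class with a mutable parser stack in place of the real CFGFSM. StatefulFSM, StatelessFSM, consume, and make_batch are hypothetical names for illustration only.

```python
from copy import deepcopy
from typing import List


class StatelessFSM:
    """Hypothetical FSM with no per-sequence data."""

    def copy(self) -> "StatelessFSM":
        return self  # every sequence in the batch shares this object


class StatefulFSM:
    """Hypothetical stand-in for an FSM that carries incremental parser state."""

    def __init__(self) -> None:
        self.parser_stack: List[str] = []  # mutated as tokens are consumed

    def consume(self, symbol: str) -> None:
        self.parser_stack.append(symbol)

    def copy(self) -> "StatefulFSM":
        return deepcopy(self)  # each sequence gets an independent parser state


def make_batch(fsm, prompts: List[str]) -> list:
    # Same call site for both kinds of FSM; only `copy` differs.
    return [fsm.copy() for _ in prompts]
```

Keeping the list of FSMs in the generator lets the call site stay uniform: stateless FSMs cost nothing extra, and stateful ones pay for a deep copy only where isolation is needed.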

rlouf · 2024-01-06T21:26:48Z

Great! I’ll review shortly. Could you rebase the branch on main? I see a lot of commits that are not related to this PR

benlipkin · 2024-01-06T22:31:19Z

Great! I’ll review shortly. Could you rebase the branch on main? I see a lot of commits that are not related to this PR

Sorry about that. I just rebased, but I had made a bit of a mess earlier when I merged main into this branch, since a lot had changed in between. If this is still too messy, I can open a new PR on a new branch and just apply the diff with main to make the history cleaner. Let me know preferences.

rlouf · 2024-01-07T07:59:20Z

If this is still too messy, I can open a new PR on a new branch and just apply the diff with main to make the history cleaner. Let me know preferences.

Yes, I am sorry for the extra work but this would be preferable. In the future, use rebase instead of merge when you are working on a branch!

benlipkin · 2024-01-07T16:36:19Z

Moved and updated to #510

rlouf reviewed Dec 20, 2023

rlouf force-pushed the refactor-fsm-idx-tracking branch from dd47100 to f206c5e Compare December 20, 2023 09:54

rlouf reviewed Jan 6, 2024

benlipkin and others added 21 commits January 6, 2024 16:34

refactor idx tracking from fsm state to sequence generation

09cc3d7

can now remove reset method from fsm interface

9c7a53f

Format some docstrings

9cb89a0

Simplify the README and update documentation

0db8b9e

Create community section

5d25f19

Change documentation style

9b28f3c

Update Getting started

667bc9a

Improve getting started section

1625360

Expand the documentation section

425f44d

Use image instead of code snippet on front page

f5d752b

Format pip install outlines in front page

b1c05ef

Fix Quickstart examples

8e772d5

Document the vLLM integration

7dbfa55

Restore the parameter allowing to stop the sequence once a given string has been generated

0739eb6

Handle max_tokens in the Generator rather than in the FSM

46a3a5e

Pass max_tokens and stop_at as arguments when calling the SequenceGenerator instead of when initializing it

e4c5714

Handle the max_tokens and stop_at arguments in the stream method of SequenceGenerator

7096623

Rename and document new methods

213beab

Update vllm patch to v0.2.6

07dfe90

Add regex support to vLLM endpoint

3e528dd

Update the documentation

16e75ed

rlouf and others added 6 commits January 6, 2024 17:05

Do not import transformers at top level

99dcdce

refactor idx tracking from fsm state to sequence generation

2be3997

can now remove reset method from fsm interface

c0e1d0b

tidy up after rebase

68720f7

clean up formatting

1564788

clean up rebase

0c79c3e

benlipkin force-pushed the refactor-fsm-idx-tracking branch from a887385 to 0c79c3e Compare January 6, 2024 22:25

benlipkin mentioned this pull request Jan 7, 2024

Remove idx arg from FSM interface #510

Merged

benlipkin closed this Jan 7, 2024
