-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(feat) Add trajectory replay for headless mode #6215
Merged
+188
−42
Merged
Changes from 8 commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
917aba8
Rename trajectory save config
li-boxuan f31549d
Doc for replay_trajectory_path
li-boxuan 113f88f
Implement event replay
li-boxuan bd272be
Fix bugs
li-boxuan cc11095
Merge remote-tracking branch 'upstream/main' into boxuanli/trajectory…
li-boxuan 0dd31bf
Merge remote-tracking branch 'upstream/main' into boxuanli/trajectory…
li-boxuan 1a2da26
Refactor
li-boxuan 79aff3e
Merge remote-tracking branch 'upstream/main' into boxuanli/trajectory…
li-boxuan 74705a7
Merge remote-tracking branch 'upstream/main' into boxuanli/trajectory…
li-boxuan File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
from openhands.core.logger import openhands_logger as logger | ||
from openhands.events.action.action import Action | ||
from openhands.events.event import Event, EventSource | ||
|
||
|
||
class ReplayManager: | ||
"""ReplayManager manages the lifecycle of a replay session of a given trajectory. | ||
|
||
Replay manager keeps track of a list of events, replays actions, and ignore | ||
messages and observations. It could lead to unexpected or even errorneous | ||
results if any action is non-deterministic, or if the initial state before | ||
the replay session is different from the initial state of the trajectory. | ||
""" | ||
|
||
def __init__(self, replay_events: list[Event] | None): | ||
if replay_events: | ||
logger.info(f'Replay logs loaded, events length = {len(replay_events)}') | ||
self.replay_events = replay_events | ||
self.replay_mode = bool(replay_events) | ||
self.replay_index = 0 | ||
|
||
def _replayable(self) -> bool: | ||
return ( | ||
self.replay_events is not None | ||
and self.replay_index < len(self.replay_events) | ||
and isinstance(self.replay_events[self.replay_index], Action) | ||
and self.replay_events[self.replay_index].source != EventSource.USER | ||
) | ||
|
||
def should_replay(self) -> bool: | ||
""" | ||
Whether the controller is in trajectory replay mode, and the replay | ||
hasn't finished. Note: after the replay is finished, the user and | ||
the agent could continue to message/act. | ||
|
||
This method also moves "replay_index" to the next action, if applicable. | ||
""" | ||
if not self.replay_mode: | ||
return False | ||
|
||
assert self.replay_events is not None | ||
while self.replay_index < len(self.replay_events) and not self._replayable(): | ||
self.replay_index += 1 | ||
|
||
return self._replayable() | ||
|
||
def step(self) -> Action: | ||
assert self.replay_events is not None | ||
event = self.replay_events[self.replay_index] | ||
assert isinstance(event, Action) | ||
enyst marked this conversation as resolved.
Show resolved
Hide resolved
|
||
self.replay_index += 1 | ||
return event |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see! This has to work just fine... I do wonder though, what could break it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I bet delegation won't work; I am not even sure if trajectory properly records delegation.
yeah that is just not possible with headless mode, but it would be interesting when we enable this in GUI mode as well... I'd love to have that functionality working, so that people can just upload trajectory to replay some recorded events first and then start working with agents.Oh actually headless mode does allow interactive inputs, that would be interesting to test out as the next step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did think of it here! Though I didn't test this use case after the last changes (and for a long time really).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested in headless mode and it just worked!
Steps 0-5 were from replay
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it continues normally. But it can't replay the new one, together with that user message? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I mean, at this end of the run in your image, the event stream has all events, those first 5 steps plus 3 other steps. The agent history has them too. If we save trajectory now,
traj2.json
should contain all of them, including the user message in the middle. Can we replay thistraj2.json
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💭 no, because
"wait_for_response": true
from agent message would trigger aAWAITING_USER_INPUT
state.A workaround is to manually fix the trajectory and change
AWAITING_USER_INPUT
to false. And that works!A hack in the code is to somehow not
AWAITING_USER_INPUT
if there's a next action to replay. (Not implemented)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I wouldn't hack this. Maybe a next iteration of this feature, when we accept user messages, could take care of it because the user message should (?) change the agent state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it needs some design in the next iteration. I am thinking that the agent state space should be a subset under replay mode.
AWAITING_USER_INPUT
is not a valid state during replay; it's only valid after a replay (or equivalently, without replay).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense too!