merge hoang into ketan/coact branch
ketan1741 committed Sep 22, 2024
2 parents b6a6938 + 9a7c5a9 commit e5ae172
Showing 13 changed files with 423 additions and 271 deletions.
40 changes: 32 additions & 8 deletions agenthub/coact_agent/README.md
Original file line number Diff line number Diff line change
@@ -1,12 +1,36 @@
# CoAct Multi-Agent Framework

This folder implements a multi-agent workflow inspired by the CoAct framework ([paper](https://arxiv.org/abs/2406.13381)). It provides a structure for defining, planning, and executing tasks using multiple agents.

## Agents

1. `CoActPlannerAgent`:
- is responsible for exploring and creating a global plan. It can replan if there are issues with the previous plan.
- has the full capabilities of [CodeActAgent](https://github.com/All-Hands-AI/OpenHands/tree/main/agenthub/codeact_agent).
2. `CoActExecutorAgent`:
- is responsible for executing the proposed plan. If it runs into issues with the plan, it can request a new one.
- also has the full capabilities of [CodeActAgent](https://github.com/All-Hands-AI/OpenHands/tree/main/agenthub/codeact_agent).


## Plan structure
```markdown
The user message is: <<Full user's message here.>>
# Phases
## Phase 1
- description: <<The task that needs to be done in this phase.>>
- reason: <<Assistant's thorough thoughts on why this phase is necessary, with tips/code to help the executor finish the task more easily.>>
- expected_state: <<Describe the expected state after this phase is completed. If the task involves code editing, provide the expectation of the code after the edit.>>
<file_path> <<The file path to edit. In one phase only 1 file is edited.>> </file_path>
<expected_content>
<<The partial expected content here WITH LINE NUMBERS and a vertical bar before the actual code e.g., 1|, 11|.>>
</expected_content>
## Phase 2
- description: ...
- reason: ...
- expected_state: ...
<file_path> ... </file_path>
<expected_content>
...|...
</expected_content>
## Phase ...
```
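As a minimal sketch of how a plan in this structure could be read back into structured data, the snippet below splits the text on `## Phase N` headings and collects each phase's fields. The `Phase` dataclass and `parse_plan` function are hypothetical names introduced for illustration only; they are not part of the CoAct agent code, and the example plan is adapted from the executor prompt example with indentation assumed.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Phase:
    """One phase of a global plan (hypothetical container, not from the repo)."""
    number: int
    description: str = ''
    reason: str = ''
    expected_state: str = ''
    file_path: Optional[str] = None
    # Maps a line number to the code expected on that line.
    expected_content: Dict[int, str] = field(default_factory=dict)


def parse_plan(plan: str) -> List[Phase]:
    """Parse the plan text into a list of Phase objects."""
    phases = []
    # re.split keeps the captured phase numbers:
    # [preamble, '1', body1, '2', body2, ...]
    parts = re.split(r'^## Phase (\d+)[ \t]*$', plan, flags=re.MULTILINE)
    for number, body in zip(parts[1::2], parts[2::2]):
        phase = Phase(number=int(number))
        for key in ('description', 'reason', 'expected_state'):
            m = re.search(rf'^- {key}: (.*)$', body, flags=re.MULTILINE)
            if m:
                setattr(phase, key, m.group(1).strip())
        m = re.search(r'<file_path>(.*?)</file_path>', body, flags=re.DOTALL)
        if m:
            phase.file_path = m.group(1).strip()
        m = re.search(r'<expected_content>\n(.*?)</expected_content>', body, flags=re.DOTALL)
        if m:
            for line in m.group(1).splitlines():
                # Each line looks like "10|    app.run(port=5000)".
                lm = re.match(r'(\d+)\|(.*)$', line)
                if lm:
                    phase.expected_content[int(lm.group(1))] = lm.group(2)
        phases.append(phase)
    return phases


EXAMPLE_PLAN = """The user message is: "Display numbers 1 to 10 on a web page at port 5000."
# Phases
## Phase 1
- description: Edit `app.py` to add the code that starts the web server.
- reason: The file is missing the code to start the server.
- expected_state: `app.py` contains a __main__ block that starts the server.
<file_path> /workspace/app.py </file_path>
<expected_content>
8|
9|if __name__ == '__main__':
10|    app.run(port=5000)
</expected_content>
## Phase 2
- description: Run `app.py`.
- reason: The server must be running to display the numbers.
- expected_state: The numbers are displayed at port 5000.
"""

phases = parse_plan(EXAMPLE_PLAN)
```

Keying `expected_content` by line number mirrors the `1|`, `11|` prefixes the plan format asks for, so a checker could compare individual lines of the edited file against the expectation.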
4 changes: 2 additions & 2 deletions agenthub/coact_agent/executor/action_parser.py
@@ -25,7 +25,7 @@ class ExecutorResponseParser(CodeActResponseParser):
"""

def __init__(self):
# Need to pay attention to the item order in self.action_parsers
# Need to pay attention to the order of items in self.action_parsers
super().__init__()
self.action_parsers = [
CodeActActionParserFinish(),
@@ -62,7 +62,7 @@ def check_condition(self, action_str: str) -> bool:
def parse(self, action_str: str) -> Action:
assert (
self.request is not None
), 'self.global_plan should not be None when parse is called'
), 'self.request should not be None when parse is called'

replan_request = self.request.group(1).strip()
return AgentFinishAction(
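The hunk above reads the replanning reason from `self.request.group(1)`, which suggests a regular-expression match on the `<execute_request>` tag. The diff does not show the actual pattern, so the following is only a sketch under that assumption; `REQUEST_PATTERN` and `extract_replan_request` are hypothetical names.

```python
import re
from typing import Optional

# Assumed pattern: the regex actually used by the parser is not visible in this diff.
REQUEST_PATTERN = re.compile(r'<execute_request>(.*?)</execute_request>', re.DOTALL)


def extract_replan_request(action_str: str) -> Optional[str]:
    """Return the executor's replanning reason, or None if no request is present."""
    match = REQUEST_PATTERN.search(action_str)
    if match is None:
        return None
    # group(1) is the text between the tags, as in `self.request.group(1).strip()`.
    return match.group(1).strip()


reason = extract_replan_request(
    'Phase 1 failed. <execute_request> The plan edits a file that does not exist. </execute_request>'
)
no_request = extract_replan_request('All phases are done. <finish></finish>')
```

Returning `None` when the tag is absent plays the role of `check_condition` returning `False`, so `parse` would only run once a match exists.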
1 change: 1 addition & 0 deletions agenthub/coact_agent/executor/executor_agent.py
@@ -20,3 +20,4 @@ def __init__(self, llm: LLM, config: AgentConfig) -> None:
agent_skills_docs=AgentSkillsRequirement.documentation,
micro_agent=self.micro_agent,
)
self.stop_sequences.append('</execute_request>')
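To illustrate what registering `</execute_request>` as a stop sequence implies, here is a small sketch. It assumes, as is common for LLM APIs but not confirmed by this diff, that generation ends just before the stop sequence and the sequence itself is omitted from the returned text, so the closing tag has to be restored before parsing. Both helper functions are hypothetical.

```python
from typing import List


def truncate_at_stop(text: str, stop_sequences: List[str]) -> str:
    """Mimic an API that ends generation just before the first stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


def restore_closing_tag(text: str, tag: str = 'execute_request') -> str:
    """Re-append a closing tag that the stop sequence stripped from the response."""
    if f'<{tag}>' in text and f'</{tag}>' not in text:
        return text + f'</{tag}>'
    return text


raw = 'I cannot proceed. <execute_request> app.py is missing. </execute_request> extra text'
truncated = truncate_at_stop(raw, ['</execute_request>'])
restored = restore_closing_tag(truncated)
```

Stopping at the closing tag keeps the model from generating anything after a replanning request, which matches the prompt's rule of one action per response.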
29 changes: 16 additions & 13 deletions agenthub/coact_agent/executor/system_prompt.j2
@@ -11,31 +11,34 @@ print("Hello World!")
</execute_ipython>

The agent can execute bash commands wrapped with <execute_bash>, e.g. <execute_bash> ls </execute_bash>.
The agent is not allowed to run interactive commands. For commands that may run indefinitely,
the output should be redirected to a file and the command run in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
If a bash command returns exit code `-1`, this means the process is not yet finished.
The assistant must then send a second <execute_bash>. The second <execute_bash> can be empty
(which will retrieve any additional logs), or it can contain text to be sent to STDIN of the running process,
or it can contain the text `ctrl+c` to interrupt the process.

If a command execution result says "Command timed out. Sending SIGINT to the process",
the agent should retry running the command in the background.
For commands that may run indefinitely, the output should be redirected to a file and the command run
in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
If a command execution result says "Command timed out. Sending SIGINT to the process", the assistant should retry running the command in the background.

As a local executor agent, there are some additional actions that you can use to communicate back to the global planner agent:
- `<execute_request>`: You have encountered an exception in the execution process. You suspect problems with the global planner's plan and trigger a request for replanning. Explain why you decide to request a new global plan using this action.

{% endset %}
{% set BROWSING_PREFIX %}
The assistant can browse the Internet with <execute_browse> and </execute_browse>.
The agent can browse the Internet with <execute_browse> and </execute_browse>.
For example, <execute_browse> Tell me the USA's president using Google search </execute_browse>.
Or <execute_browse> Tell me what is in http://example.com </execute_browse>.
{% endset %}
{% set PIP_INSTALL_PREFIX %}
The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: <execute_ipython> %pip install [package needed] </execute_ipython> and should always import packages and define variables before starting to use them.
The agent can install Python packages using the %pip magic command in an IPython environment by using the following syntax: <execute_ipython> %pip install [package needed] </execute_ipython> and should always import packages and define variables before starting to use them.
{% endset %}
{% set SYSTEM_PREFIX = MINIMAL_SYSTEM_PREFIX + BROWSING_PREFIX + PIP_INSTALL_PREFIX %}
{% set COMMAND_DOCS %}
Apart from the standard Python library, the assistant can also use the following functions (already imported) in <execute_ipython> environment:
Apart from the standard Python library, the agent can also use the following functions (already imported) in <execute_ipython> environment:
{{ agent_skills_docs }}
IMPORTANT:
- `open_file` only returns the first 100 lines of the file by default! The assistant MUST use `scroll_down` repeatedly to read the full file BEFORE making edits!
- The assistant shall adhere to THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRING PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write the line out, with all leading spaces before the code!
- `open_file` only returns the first 100 lines of the file by default! The agent MUST use `scroll_down` repeatedly to read the full file BEFORE making edits!
- The agent shall adhere to THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRING PROPER INDENTATION. If the agent would like to add the line ' print(x)', it must fully write the line out, with all leading spaces before the code!
- Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
- Any code issued should be less than 50 lines to avoid context being cut off!
- After EVERY `create_file` the method `append_file` shall be used to write the FIRST content!
@@ -44,13 +47,13 @@ IMPORTANT:
{% endset %}
{% set SYSTEM_SUFFIX %}
Responses should be concise.
The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
If the assistant is finished with the task you MUST include <finish></finish> in your response.
The agent should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per response, unless the agent is finished with the task or needs more input or action from the user in order to proceed.
If the agent is finished with the task you MUST include <finish></finish> in your response.
IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
The agent should utilize full file paths and the `pwd` command to prevent path-related errors.
The agent must avoid apologies and thanks in its responses.
Remember to execute ALL the phases of the global plan and only return the summary of the whole process to the global agent.
Remember to execute ALL the phases of the global plan and only return the summary of the whole process to the global agent. Strictly follow it and do NOT do anything beyond the scope of the global plan.

{% endset %}
{# Combine all parts without newlines between them #}
118 changes: 87 additions & 31 deletions agenthub/coact_agent/executor/user_prompt.j2
@@ -1,39 +1,53 @@
{% set DEFAULT_EXAMPLE %}
--- START OF EXAMPLE ---

USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
--- START OF EXAMPLE 1 ---

USER: The user message is: "Create a list of numbers from 1 to 10, and display them in a web page at port 5000. There's an existing Python file `app.py` that you can use."
# Phases
## Phase 1
- description: Edit the existing Python file `app.py` to insert the code for starting a web server, at the end of the file.
- reason: The existing Python file is missing the code to start the web server. Normally, the server is started using the `app.run(port=5000)` command. This phase will add this necessary code in a __main__ block to start the server. You can use the `insert_content_at_line` function or other appropriate ways to add the code.
- expected_state: The Python file `app.py` is updated with the code to start the web server.
<file_path> /workspace/app.py </file_path>
<expected_content>
8|
9|if __name__ == '__main__':
10| app.run(port=5000)
</expected_content>
## Phase 2
- description: Run the Python file `app.py` to display the list of numbers on a web page at port 5000.
- reason: After updating the Python file, we need to run it to display the list of numbers on a web page. The shell command can be: `python3 app.py`.
- expected_state: The list of numbers is displayed on a web page at port 5000.

ASSISTANT:
Sure! Let me create a Python file `app.py`:
Let's start with the first phase: editing the Python file `app.py` to insert the code for starting a web server at line 8. I will open the file at that line to check its current content:
<execute_ipython>
create_file('app.py')
open_file('/workspace/app.py', 8)
</execute_ipython>

USER:
OBSERVATION:
[File: /workspace/app.py (1 lines total)]
[File: /workspace/app.py (8 lines total)]
(this is the beginning of the file)
1|
1|from flask import Flask
2|app = Flask(__name__)
3|
4|@app.route('/')
5|def index():
6| numbers = list(range(1, 11))
7| return str(numbers)
8|
(this is the end of the file)
[File app.py created.]

ASSISTANT:
Now I will write the Python code for starting a web server and save it to the file `app.py`:
As expected, the file is missing the code to start the web server. Now I will write the code from the plan and save it to the file `app.py`:
<execute_ipython>
EDITED_CODE="""from flask import Flask
app = Flask(__name__)

@app.route('/')
def index():
numbers = list(range(1, 11))
return str(numbers)

EDITED_CODE="""
if __name__ == '__main__':
app.run(port=5000)"""

insert_content_at_line(
'app.py',
1,
8,
EDITED_CODE,
)
</execute_ipython>
@@ -52,10 +66,10 @@ OBSERVATION:
9|if __name__ == '__main__':
10| app.run(port=5000)
(this is the end of the file)
[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
[File updated (edited at line 9). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]

ASSISTANT:
I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
I have updated the Python file `app.py` to properly start the web server and display a list of numbers from 1 to 10 when you run it. Let's proceed to phase 2 to run the Python file:
<execute_bash>
python3 app.py > server.log 2>&1 &
</execute_bash>
@@ -125,12 +139,25 @@ WARNING: This is a development server. Do not use it in a production deployment.
Press CTRL+C to quit

ASSISTANT:
The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. The tasks are now complete.
Summary:
- Edited the existing Python file `app.py` to insert the code for starting a web server.
- Ran the Python file to display the list of numbers on a web page at port 5000.
<finish></finish>

USER: Now browse the newly started server's homepage and show me the content.
--- END OF EXAMPLE 1 ---

--- START OF EXAMPLE 2 ---

USER: The user message is: "Now browse the newly started server's homepage and show me the content."
# Phases
## Phase 1
- description: Browse the server's homepage at http://127.0.0.1:5000
- reason: To verify that the server is running and displaying the list of numbers correctly, we need to browse the server's homepage. We can use the `execute_browse` function to access the URL.
- expected_state: The content of the server's homepage is retrieved.

ASSISTANT:
Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
Sure! Let me start the first phase by browsing the server's homepage at http://127.0.0.1:5000:
<execute_browse>
Get the content on "http://127.0.0.1:5000"
</execute_browse>
@@ -140,12 +167,37 @@ OBSERVATION:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

ASSISTANT:
The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". The tasks are now complete.
Summary:
- Browsed the server's homepage at http://127.0.0.1:5000 and retrieved the content "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]".
<finish></finish>

USER: Now kill the server, make it display the numbers in a table format.
--- END OF EXAMPLE 2 ---

--- START OF EXAMPLE 3 ---

USER: The user message is: "Now kill the server, make it display the numbers in a table format."
# Phases
## Phase 1
- description: Stop the server running on port 5000.
- reason: The server needs to be stopped before modifying the Python code.
- expected_state: The server is stopped.
## Phase 2
- description: Modify the Python code to display the numbers in a table format.
- reason: The current code returns the numbers as a string, which is incorrect. We need to modify it, starting at line 6, to display the numbers in a table format by using HTML tags for table rows and cells.
- expected_state: The Python code is updated to display the numbers in a table format.
<file_path> /workspace/app.py </file_path>
<expected_content>
6| numbers = list(range(1, 11))
7| return "<table>" + "".join([f"<tr><td>{num}</td></tr>" for num in numbers]) + "</table>"
</expected_content>
## Phase 3
- description: Run the Python file again to display the numbers in a table format on the web page at port 5000.
- reason: After modifying the Python code, we need to run the file again to display the numbers in the expected table format.
- expected_state: The numbers are displayed in a table format on the web page.

ASSISTANT:
Sure! Let me stop the server first:
Sure! Let me execute the first phase to stop the server first:
<execute_bash>
kill 124
</execute_bash>
@@ -155,9 +207,9 @@ OBSERVATION:
[1]+ Terminated python3 app.py > server.log 2>&1

ASSISTANT:
The server has been stopped. Let me open the Python file and modify it.
The server has been stopped. Let me move to the second phase to open the Python file at line 6 and modify it.
<execute_ipython>
open_file('app.py')
open_file('app.py', 6)
</execute_ipython>

USER:
@@ -203,7 +255,7 @@ Observation:
[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]

ASSISTANT:
Running the updated file:
Now, let's move to the third phase and run the updated file:
<execute_bash>
python3 app.py > server.log 2>&1 &
</execute_bash>
@@ -213,10 +265,14 @@ Observation:
[1] 126

ASSISTANT:
The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. The tasks are now complete.
Summary:
- Stopped the server running on port 5000.
- Modified the Python code to display the numbers in a table format.
- Ran the Python file again to display the numbers in a table format on the web page at port 5000.
<finish></finish>

--- END OF EXAMPLE ---
--- END OF EXAMPLE 3 ---
{% endset %}
Here is an example of how you can interact with the environment for task solving:
{{ DEFAULT_EXAMPLE }}