refactor: rewrite underlying architecture of the framework to recursive object #105

Merged
merged 22 commits into from
Dec 12, 2024

Conversation

grabbou
Collaborator

@grabbou grabbou commented Dec 12, 2024

Fixes #68
Related #102

There are a lot of changes in this PR, so I will break them down here, together with the motivation.

The prime motivation for the change was to simplify the structure of "state" by getting rid of agent-specific properties (such as agentName, agentRequest) and moving to a simple state object that can have "child" states (either a single element or an array).

That way, we could get rid of special handling for supervisor, resource planner (two built-in agents) and further simplify the logic.

Thanks to that, we have "out-of-the-box" support for parallelism, cancellations, and hand-offs. If an agent wants to delegate, it simply creates a new workflowState, adds it as a child, and returns the state (see how "supervisor" is implemented). If an agent wants to hand off the task, it can simply replace its entire state (see how "resourcePlanner" gets its job done).
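
To make the delegation and hand-off mechanics concrete, here is a minimal sketch of a recursive state object. Everything in it is an assumption made for illustration: the field names (`agent`, `status`, `messages`, `child`), the `workflowState` signature, and the `delegate`, `handOff`, and `flatten` helpers are hypothetical and may not match the code in this PR.

```typescript
// Hypothetical shape: these names are illustrative, not the framework's actual API.
type Status = 'idle' | 'running' | 'finished'

interface WorkflowState {
  agent: string // key of the agent on the "team" object
  status: Status
  messages: string[]
  child: WorkflowState | WorkflowState[] | null
}

const workflowState = (agent: string, messages: string[] = []): WorkflowState => ({
  agent,
  status: 'idle',
  messages,
  child: null,
})

// Delegation: keep your own state and attach new states as children.
// An array of children is what would allow them to be processed in parallel.
const delegate = (state: WorkflowState, agents: string[], task: string): WorkflowState => ({
  ...state,
  status: 'running',
  child: agents.map((name) => workflowState(name, [task])),
})

// Hand-off: replace the entire state with one owned by another agent.
const handOff = (state: WorkflowState, agent: string): WorkflowState =>
  workflowState(agent, state.messages)

// Walk the tree to collect every state, regardless of nesting depth.
const flatten = (state: WorkflowState): WorkflowState[] => {
  const children: WorkflowState[] =
    state.child === null ? [] : Array.isArray(state.child) ? state.child : [state.child]
  return children.reduce<WorkflowState[]>((acc, c) => acc.concat(flatten(c)), [state])
}
```

Because both helpers return a new object instead of mutating the input, a runner written this way never needs to know whether it is at the root of the tree or several levels down.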

Other changes worth mentioning:

  • Agents no longer have a role. What matters now is the key on the "team" object, which is aligned with how we define tools. This is more future-proof if we serialize the state object on the server, since we no longer rely on array positions.
  • Renamed "members" array to "team" object
  • Supervisor and Planner are now normal agents (and you can overwrite them, or change their behavior, or define your own)
  • Added tool helpers to be used on the server side
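
As a rough illustration of why keys on the "team" object serialize better than array positions, consider the sketch below. The `Agent` and `State` types and the `resolveAgent` helper are hypothetical and exist only for this example.

```typescript
// Hypothetical types for illustration only.
interface Agent {
  description: string
}

interface State {
  agent: string // a key on the "team" object, not an array index
  messages: string[]
}

const team: Record<string, Agent> = {
  researcher: { description: 'Finds sources' },
  writer: { description: 'Writes the summary' },
}

const state: State = { agent: 'writer', messages: [] }

// The state survives a JSON round trip (e.g. to and from a server)...
const restored: State = JSON.parse(JSON.stringify(state))

// ...and still resolves to the same agent, even if entries in `team`
// were added or reordered in the meantime.
const resolveAgent = (s: State, t: Record<string, Agent>): Agent => {
  const agent = t[s.agent]
  if (!agent) throw new Error(`Unknown agent: ${s.agent}`)
  return agent
}
```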

Here is a screenshot of what the output looks like at the moment:

[Screenshot of the current output]

@grabbou grabbou requested a review from pkarw December 12, 2024 11:57
@pkarw
Collaborator

pkarw commented Dec 12, 2024

Nice!!! I will review and test it later today but frankly speaking - I love it! Great job @grabbou!

@grabbou
Collaborator Author

grabbou commented Dec 12, 2024

Thanks! I am adding server-side support now, then I will get some rest and collect feedback. We should run all the examples to make sure they work, and tweak them where they don't!

@tomaszfelczyk

tomaszfelczyk commented Dec 12, 2024

@grabbou Hi! Refactor of orchestration was my first thought. However, I wondered about breaking down the architecture into primitives: agent, task, prompt to give more flexibility for replacing dependencies.

IMHO:

  • agent should store its system prompt as it directly describes him internally, description should describe him externally.
  • run should be outside of agent as task primitive e.g. task(agent, prompt) formatOutputTask(agent, prompt, schema)
  • task then could be tool of agent

Are you open to PRs at this point?

@grabbou
Collaborator Author

grabbou commented Dec 12, 2024

Hey @tomaszfelczyk, thanks for feedback! Definitely open to suggestions - ideally, please leave some comments inline in the PR (if you can), that way it's easier to discuss this.

As for your questions, let me answer inline:

agent should store its system prompt as it directly describes him internally, description should describe him externally.

Right now, each agent is an object and MUST define "run". This is the runner that executes its operations. The default "agent()" primitive in the library uses a predefined "run" that requires "description" to operate.

There are other agents, such as supervisor, that have a custom runner, and they don't define a description, as it would be redundant.

If you decide to build your own primitive, you can store a system prompt, or "role/experience", or anything else! Hope that answers your question!
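
A simplified sketch of that contract is shown below. The types and function bodies are assumptions for illustration: the real runners presumably call a model and are likely asynchronous, whereas these are synchronous stand-ins that only append a message. `customAgent` is a hypothetical example of a user-defined primitive.

```typescript
// Hypothetical, simplified types; the real signatures in the library may differ.
interface State {
  messages: string[]
}

interface Agent {
  run: (state: State) => State // the only required property
  description?: string
}

// Default primitive: its predefined runner relies on `description`.
const agent = (options: { description: string }): Agent => ({
  description: options.description,
  run: (state) => ({ messages: [...state.messages, `${options.description}: done`] }),
})

// Built-in agent with a custom runner and no description.
const supervisor = (): Agent => ({
  run: (state) => ({ messages: [...state.messages, 'supervisor: picked next task'] }),
})

// A user-defined primitive can carry any extra data it wants, e.g. a system prompt.
const customAgent = (systemPrompt: string): Agent & { systemPrompt: string } => ({
  systemPrompt,
  run: (state) => ({ messages: [...state.messages, `(${systemPrompt}) done`] }),
})

// The workflow only needs `run`, so all three kinds are interchangeable.
const step = (a: Agent, state: State): State => a.run(state)
```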

run should be outside of agent as task primitive e.g. task(agent, prompt) formatOutputTask(agent, prompt, schema)
task then could be tool of agent

run is already defined outside of the agent and does not rely on anything but the parameters it receives. It is associated with the agent object so that we can easily call different runners, depending on what the user selected at workflow creation.

task then could be tool of agent

I would ask you to elaborate a bit more on this one! 🙏

@pkarw pkarw merged commit aab1c36 into main Dec 12, 2024
grabbou added a commit that referenced this pull request Dec 13, 2024
We separate knowledge from workflow description to avoid
tasks/descriptions interfering with agent. This could possibly be
something else in the future, like a method that gets it from the
database/something.

This is just demonstration, as alternative to #105. I am still not sure
about this.
grabbou added a commit that referenced this pull request Dec 13, 2024
Before, we downloaded the `main` snapshot. It was fine. However, it
didn't work well when the current `main` source code was incompatible
with the released `npm` modules. This situation happened with the recent
#105. Now, we're getting the latest official release and taking the
examples out of it.

Because the tarball starts with a directory of an unknown name (it
follows the `org-repo-lastcommitid` pattern), we now extract the full
release rather than only the example folder, as we did before.

---------

Co-authored-by: Mike Grabowski <[email protected]>
@tomaszfelczyk

tomaszfelczyk commented Dec 14, 2024

@grabbou

Right now, each agent is an object and MUST define "run". This is the runner that executes its operations. The default "agent()" primitive in the library uses a predefined "run" that requires "description" to operate.

After a closer read, you are right, I agree!

I was trying to point out something else, namely the question of whether the runner should take something more than just state. (PS: state is a very generic name; every time I read the code I replaced it with task for myself.) Since state owns the messages, it can carry context in them.

For testing/researching purposes it would be nice to test a single agent's behaviour based on a single state (task), e.g.:

const state = basicTask(`Find me...`)
agent.run(state)
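
A self-contained version of that idea might look like the sketch below. `basicTask` is the helper name proposed above, not an existing function, and `echoAgent` is a hypothetical deterministic stub standing in for a model-backed agent.

```typescript
// Hypothetical types; `basicTask` is a proposed helper, not part of the library.
interface State {
  messages: string[]
  status: 'idle' | 'finished'
}

interface Agent {
  run: (state: State) => State
}

// Builds a standalone state (task) from a single prompt.
const basicTask = (prompt: string): State => ({ messages: [prompt], status: 'idle' })

// Stub agent standing in for a model-backed one, so the check is deterministic.
const echoAgent: Agent = {
  run: (state) => ({
    messages: [...state.messages, `echo: ${state.messages[0]}`],
    status: 'finished',
  }),
}

// Run one agent against one state, with no workflow around it.
const result = echoAgent.run(basicTask('Find me a flight'))
```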

Since the workflow is now hardcoded (one of many possible workflows, more or less complex), the agent is dependent on one type of workflow.

It would be nice if workflows could be more flexible, and could also return a result matching a schema instead of a string.
To achieve this, delegation to a different agent could be done using delegate() and awareOf() tools (the latter can help the supervisor understand the abilities of the whole workflow). Then the agent result (schema) can be defined in the task properties, and it will be the result of the workflow.

const hierarchicalWorkflow = (team: Agent[], description: string) => {
    const resourcePlanner = resource_planner({
        tools: {
            delegate: delegateTo(team)
        }
    })
    return workflow({
        team: team,
        entrypoint: supervisor({
            tools: {
                delegate: delegateTo([resourcePlanner]),
                awareOf: awareOf(team)
            }
        }),
        description: description,
        ...
    })
}
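
One way to read the proposal is that `delegateTo` and `awareOf` are tool factories closed over a list of agents. The sketch below follows that reading; the `Tool` interface, the `name` field on `Agent`, and both function bodies are hypothetical, since neither helper exists in the PR.

```typescript
// Hypothetical sketch of the proposed tool factories; none of this is library API.
interface Agent {
  name: string
  description: string
}

interface Tool<I, O> {
  description: string
  execute: (input: I) => O
}

interface Delegation {
  agent: string
  task: string
}

// Lets an agent (e.g. the supervisor) see who is available in the workflow.
const awareOf = (team: Agent[]): Tool<void, string> => ({
  description: 'Lists the agents available in this workflow',
  execute: () => team.map((a) => `${a.name}: ${a.description}`).join('\n'),
})

// Lets an agent delegate a task, but only to the agents it was given.
const delegateTo = (team: Agent[]): Tool<{ to: string; task: string }, Delegation> => ({
  description: `Delegates a task to one of: ${team.map((a) => a.name).join(', ')}`,
  execute: ({ to, task }) => {
    if (!team.some((a) => a.name === to)) {
      throw new Error(`Cannot delegate to unknown agent "${to}"`)
    }
    return { agent: to, task }
  },
})
```

Restricting each `delegateTo` instance to its own list is what would let the supervisor delegate only to the resource planner while the planner delegates to the rest of the team.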

Development

Successfully merging this pull request may close these issues.

feat: supervisor should be configurable agent
3 participants