
# [editor][server endpoints][3/n]: Proof-of-Concept for Streaming #615

Closed · wants to merge 3 commits

## Conversation

@rossdanlm (Contributor) commented Dec 26, 2023


The actual call in the `run` command will be a bit more complex, since we'll have to attach a stream callback and probably create a text queue iterator like what we did for Gradio, but this shows that it's possible and shouldn't be too complicated.

Sorry Jonathan, I didn't use any of your super useful generic functions, but we can do that later.

A big thing I wanted to point out: in the `yield` generator, I need to explicitly return a string (not a JSON object), otherwise this will not work, so the frontend needs to be able to parse that result somehow. We should sync on this, but JSONs are strings anyways, and as long as we follow the same output "formula" (ex: in `list_models` we output a `data` key vs. other commands where we directly return the AIConfig JSON), it should be fine.
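For illustration, here is a minimal, self-contained sketch (not from this diff) of the pattern being described, using Flask's standard streaming API; the route name mirrors this diff's `/api/test_streaming`, but the payload shape is illustrative, not the final protocol:

```python
# Minimal sketch: a Flask streaming endpoint whose generator yields
# strings. Each yielded chunk must already be a str (or bytes), so we
# serialize with json.dumps before yielding.
import json
import time

from flask import Flask, Response, stream_with_context

app = Flask(__name__)

@app.route("/api/test_streaming", methods=["POST"])
def test_streaming():
    def generate():
        for i in range(3):
            time.sleep(1)  # stand-in for waiting on model tokens
            # A JSON string the frontend can parse chunk-by-chunk
            yield json.dumps({"data": f"step {i + 1}"}) + "\n\n"

    return Response(stream_with_context(generate()), mimetype="text/plain")
```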

## Test plan

a9d769e6-d8d6-4aad-b450-6b27a75e509a.mp4

Stack created with Sapling. Best reviewed with ReviewStack.

Rossdan Craig [email protected] added 3 commits December 26, 2023 00:32
To test, follow the readme to run the backend server in a terminal, then go to another terminal and enter:

## Test plan
```
alias aiconfig="python -m 'aiconfig.scripts.aiconfig_cli'"
aiconfig edit --aiconfig-path="/Users/rossdancraig/Projects/aiconfig/cookbooks/Getting-Started/travel.aiconfig.json" --server-port=8080 --server-mode=debug_servers
curl http://localhost:8080/api/run -d '{"prompt_name":"get_activities"}' -X POST -H 'Content-Type: application/json'
```

This results in the `get_activities` prompt being deleted (notice that the second prompt is now the one that appears first):
<img width="969" alt="Screenshot 2023-12-26 at 00 30 23" src="https://github.com/lastmile-ai/aiconfig/assets/151060367/5dd905c8-7cb6-4c1f-a97b-284337196506">
<img width="824" alt="Screenshot 2023-12-26 at 00 30 49" src="https://github.com/lastmile-ai/aiconfig/assets/151060367/8bfc5003-72d0-4e5a-9bf9-14e849acade7">
TSIA, pretty simple; should be able to pass in the `params` field

See previous diff for an example of how to test
```python
from flask_cors import CORS
from aiconfig.schema import ExecuteResult, Prompt
from flask import Flask, Response, request, stream_with_context
from flask_cors import CORS # TODO: add this to requirements.txt
```

Contributor: Add it! :D

```diff
@@ -114,17 +117,83 @@ def create() -> FlaskResponse:
     state.aiconfig = AIConfigRuntime.create() # type: ignore
     return HttpResponseWithAIConfig(message="Created new AIConfig", aiconfig=state.aiconfig).to_flask_format()
```

```python
@app.route('/api/test_streaming', methods=["POST"])
```

Contributor: You may as well make the real endpoint. It's just us using it anyway; we're not committing to anything big here.

Comment on lines +122 to +126
```python
EXCLUDE_OPTIONS = {
    "prompt_index": True,
    "file_path": True,
    "callback_manager": True,
}
```

Contributor: Isn't there a lot of duplication with `HttpResponseWithAIConfig.to_flask_format()`?

Contributor Author: Yea you're right, I was mainly just playing around and trying to get things tested quickly.

```python
def generate(num_stream_steps: int):
    prompt : Prompt = state.aiconfig.get_prompt('get_activities')
    for i in range(num_stream_steps):
        time.sleep(1)
```

Contributor: Why are we sleeping?

Contributor Author: Just to prove that the streaming is going as planned and I can see outputs bit-by-bit.
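For context, a tiny standalone sketch (illustrative only, not from this diff) of why the sleep makes the streaming observable:

```python
# Illustrative only: a delay between yields means a consumer sees chunks
# arrive one at a time instead of all at once.
import time

def slow_chunks():
    for i in range(3):
        time.sleep(1)  # without this, everything may arrive in one burst
        yield f"chunk {i + 1}\n"

for chunk in slow_chunks():
    print(chunk, end="", flush=True)  # prints roughly one line per second
```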

Comment on lines +187 to +188
```python
params : str = request_json.get("params", None)
stream : bool = request_json.get("stream", True)
```

Contributor: Nit: remove the space before `:`. You can autoformat in-place with the linter.

Comment on lines +136 to +142
```python
output = ExecuteResult(
    output_type="execute_result",
    execution_count=0,
    data = "Rossdan" + str(i+1),
    metadata = {},
)
prompt.outputs = [output]
```

Contributor: Should we be explicitly constructing this? This feels like unnecessary duplication.

Contributor Author: You are right lol, it's kind of not the greatest. It would be better to just overwrite `output.data` each time, but we'd still have to initialize it first to an `ExecuteResult` object (not too hard, just check if `output` is an empty array first). But yea, just a proof of concept, nothing fancy in this diff yet.
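A hedged sketch of that alternative (initialize once, then overwrite `data`); it assumes the same `state`, `EXCLUDE_OPTIONS`, and imports as the diff above, and is not part of this PR:

```python
# Hypothetical variant of the generator above: construct the ExecuteResult
# once, then mutate its data field on each step instead of rebuilding it.
def generate(num_stream_steps: int):
    prompt: Prompt = state.aiconfig.get_prompt('get_activities')
    if not prompt.outputs:  # initialize first if outputs is an empty array
        prompt.outputs = [
            ExecuteResult(
                output_type="execute_result",
                execution_count=0,
                data="",
                metadata={},
            )
        ]
    for i in range(num_stream_steps):
        prompt.outputs[0].data = "Rossdan" + str(i + 1)  # just overwrite data
        aiconfig_json = state.aiconfig.model_dump(exclude=EXCLUDE_OPTIONS)
        yield str(aiconfig_json) + "\n\n"
```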

```python
# yield_output = core_utils.JSONObject({"data": aiconfig_json})
# print(f"{yield_output=}")
print(f"{str(aiconfig_json)=}\n")
yield str(aiconfig_json) + "\n\n"
```

Contributor: How does this iterate over the tokens returned? Doesn't it just yield the same fixed contents of `get_activities()` every iteration?

Contributor Author: The `aiconfig_json` is updated with different output each time. Sorry, I forgot to include the test video, but you can see it now :)

@rholinshead (Contributor):

So the response itself would be a literal string instead of an object with `{data: string}`?

> A big thing I wanted to point out: in the `yield` generator, I need to explicitly return a string (not a JSON object)

Can you explain the reasoning for this? Couldn't you alternatively just yield an object instead of the string? I'm not super familiar with generators but I didn't think they were limited to primitive types?

@jonathanlastmileai (Contributor):

> So the response itself would be a literal string instead of an object with `{data: string}`?
>
> > A big thing I wanted to point out: in the `yield` generator, I need to explicitly return a string (not a JSON object)
>
> Can you explain the reasoning for this? Couldn't you alternatively just yield an object instead of the string? I'm not super familiar with generators but I didn't think they were limited to primitive types?

This might be a Flask limitation; @rossdanlm should help clarify. I suggest that if this works, and there's any question whether Flask can do it another way, and it's easy to deal with on the frontend, let's just do that. For non-streaming endpoints I have a pretty good understanding of what's possible, and I have returned JSON objects.

@rossdanlm (Contributor Author):

> Can you explain the reasoning for this? Couldn't you alternatively just yield an object instead of the string? I'm not super familiar with generators but I didn't think they were limited to primitive types?

> This might be a Flask limitation; @rossdanlm should help clarify.

Yea, I wasn't able to get a regular object to be passed, so I had to convert the JSON to a string and it worked.

> I suggest that if this works, and there's any question whether Flask can do it another way, and it's easy to deal with on the frontend, let's just do that.

Yea I agree, unless we're able to get this unblocked in 20-30 mins of investigation
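The failure mode being described here can be reproduced in isolation; a hedged sketch follows (the exact exception text varies by Flask/Werkzeug version):

```python
# Yielding a dict from a streamed Flask response fails, because the WSGI
# layer expects each chunk to be str or bytes; serializing first works.
import json

from flask import Flask, Response

app = Flask(__name__)

@app.route("/broken")
def broken():
    def gen():
        yield {"data": "not allowed"}  # errors at request time when streamed
    return Response(gen())

@app.route("/working")
def working():
    def gen():
        yield json.dumps({"data": "fine"})  # a str, so Flask can stream it
    return Response(gen())
```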

rholinshead added a commit that referenced this pull request Dec 27, 2023
# [editor] Set Up run_prompt Callbacks

Setting up the callbacks for `run_prompt`. The request succeeds but
currently the response is not correct -- the aiconfig returned doesn't
have the outputs. This is on main, so will land this and test with
#615

A subsequent PR will also need to link local typescript aiconfig to our
editor package.json in order to have the updated output types until we
can publish the updated package. With the linking, I'll then add output
rendering.

## Testing:
- Make sure /run request succeeds:


https://github.com/lastmile-ai/aiconfig/assets/5060851/2cacc07c-3bfe-4a63-8e74-f912b64260dd
@rholinshead (Contributor):

Yielding a string is fine (and probably expected), since we can stream the JSON chunks, something like:

```python
def generate(num_stream_steps: int):
    aiconfig_json: dict | None = None
    prompt: Prompt = state.aiconfig.get_prompt(prompt_name)
    for i in range(num_stream_steps):
        time.sleep(1)
        output = ExecuteResult(
            output_type="execute_result",
            execution_count=0,
            data="Rossdan" + str(i + 1),
            metadata={},
        )
        prompt.outputs = [output]
        print(f"Done step {i+1}/{num_stream_steps}...")

        aiconfig_json = state.aiconfig.model_dump(exclude=EXCLUDE_OPTIONS)
        # print(f"{aiconfig_json=}\n")
        # yield_output = core_utils.JSONObject({"data": aiconfig_json})
        # print(f"{yield_output=}")
        print(f"{str(aiconfig_json)=}\n")
        yield json.dumps({"output_chunk": output.model_dump()})
        # yield aiconfig_json

        # HttpResponseWithAIConfig(
        #     message=f"Done step {i+1}/{num_stream_steps}...",
        #     aiconfig=state.aiconfig,
        # ).to_flask_format()

    if aiconfig_json is None:
        aiconfig_json = state.aiconfig.model_dump(exclude=EXCLUDE_OPTIONS)
    yield json.dumps({"aiconfig": aiconfig_json})
```

Where each intermediate chunk is the output chunk with accumulated content and then a final output of the full aiconfig
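For illustration, a hedged sketch of how a client could consume that framing, assuming each chunk is newline-delimited JSON (the URL and payload match the test plan above):

```python
# Hypothetical client for the chunk protocol sketched above: stream the
# response and dispatch on whether a line carries an output chunk or the
# final aiconfig.
import json

import requests

resp = requests.post(
    "http://localhost:8080/api/run",
    json={"prompt_name": "get_activities"},
    stream=True,
)
for line in resp.iter_lines():
    if not line:
        continue
    event = json.loads(line)
    if "output_chunk" in event:
        print("partial output:", event["output_chunk"].get("data"))
    elif "aiconfig" in event:
        print("final aiconfig received")
```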

@jonathanlastmileai (Contributor):

> Yielding a string is fine (and probably expected), since we can stream the JSON chunks, something like: […] Where each intermediate chunk is the output chunk with accumulated content and then a final output of the full aiconfig

Makes sense to me!

@rholinshead (Contributor) commented Dec 28, 2023:

Ok, so for the streaming to work with the client-side API we'll use (oboe), we need to wrap the entire response in an array, so:

- the first chunk should send `[{output_chunk: output}, \n` with the opening `[`
- subsequent chunks send just `{output_chunk: output}, \n`
- the end sends `{aiconfig: aiconfig_json}]` with the closing `]`

See #651 for an example; it just needs some improvements there:

- we shouldn't need to yield `[` or `]` separately (needs some Python string manipulation to add them to the first chunk / end config text)
- can we do this for all run_prompt calls? i.e. even for non-streaming run_prompt, immediately just send `[{output: output}, {aiconfig: aiconfig}]`, and ideally we can use one endpoint/handling for all run prompt calls (a sketch of this framing follows below)
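A hedged server-side sketch of that framing; the function and argument names here are hypothetical, not from #651:

```python
# Hypothetical generator implementing the array-wrapped framing above:
# the full stream parses as one JSON array, which oboe can consume
# incrementally on the client.
import json

def generate_oboe_framed(chunks, final_aiconfig):
    first = True
    for chunk in chunks:
        prefix = "[" if first else ""  # fold the opening "[" into chunk 1
        first = False
        yield prefix + json.dumps({"output_chunk": chunk}) + ",\n"
    if first:
        yield "["  # no chunks at all: still open the array
    yield json.dumps({"aiconfig": final_aiconfig}) + "]"
```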

@rossdanlm (Contributor Author):

Closing because we have an updated streaming PR in #683.

@rossdanlm rossdanlm closed this Jan 2, 2024
@rossdanlm rossdanlm deleted the pr615 branch January 2, 2024 17:44