[python] Use OutputDataWithValue type for function calls #611
Conversation
I'm wondering if we're better off keeping strings just as strings. Thoughts?
...ngFace/python/src/aiconfig_extension_hugging_face/remote_inference_client/text_generation.py
Force-pushed from bf687d2 to 8d3e926.
A few blocking comments, but otherwise looks good. Really nice work! Not an easy change.
```python
# Gemini does not support function calls so shouldn't
# get here, but just being safe
return json.dumps(output_data.value, indent=2)
raise ValueError("Not Implemented")
```
Remove? Or update string with a more descriptive error
It should never actually get here, but yeah, I can remove it.
Edit: Ok, I can't remove it; I added a more detailed error message so it's clear why.
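For illustration, a more descriptive error along these lines makes the invariant explicit (exact wording is assumed, not from this PR; `output_data.kind` follows the snippet above):

```python
# Hypothetical replacement for the bare ValueError("Not Implemented").
raise ValueError(
    f"Unexpected output data of kind {output_data.kind!r}: Gemini does not "
    "support function calls, so this branch should be unreachable."
)
```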
```python
accumulated_message += new_text
options.stream_callback(new_text, accumulated_message, 0)
output.data = accumulated_message
if isinstance(new_text, str):
```
What about the case where `new_text` isn't `str`? Can that happen?
Edit: It's actually possible (though practically unlikely), see comments in #611 (comment). I get what you're saying (this change seems unnecessary and more confusing) so I'll remove this type of logic from this diff
No, it can't happen for the HuggingFace text generation model parser.
```python
# HuggingFace Text generation does not support function
# calls so shouldn't get here, but just being safe
return json.dumps(output_data.value, indent=2)
raise ValueError("Not Implemented")
```
nit: update string comment
I can remove the value error, but why remove the comment?
...ngFace/python/src/aiconfig_extension_hugging_face/remote_inference_client/text_generation.py
extensions/LLama-Guard/python/python/src/aiconfig_extension_llama_guard/LLamaGuard.py
```python
role = output.metadata.get("assistant", None) or \
    ("rawResponse" in output.metadata and
     output.metadata["rawResponse"].get("assistant", None))
```
This is incorrect. Can you check if TS or Python implementations have this issue elsewhere in your stack?
Suggested change:

```diff
-role = output.metadata.get("assistant", None) or \
-    ("rawResponse" in output.metadata and
-     output.metadata["rawResponse"].get("assistant", None))
+role = output.metadata.get("role", None) or \
+    ("rawResponse" in output.metadata and
+     output.metadata["rawResponse"].get("role", None))
```
Also perhaps we should default role to "assistant"
Yeah, just copy-pasted wrong, my bad; it should be `role`.
> Can you check if TS or Python implementations have this issue elsewhere in your stack?

I don't have this issue anywhere else (https://github.com/lastmile-ai/aiconfig/pull/610/files#diff-e044912cdb054134cac74868d76969daa92c6dd7c1238805672e023f9b7aa92d), this was just a typo.
> Also perhaps we should default role to "assistant"
We shouldn't be doing this. Role can be one of 5 different types, not just "assistant", so we need to check for that explicitly: https://github.com/openai/openai-node/blob/b595cd953a704dba2aef4c6c3fa431f83f18ccf9/src/resources/chat/completions.ts#L514
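For reference, the five roles in the linked OpenAI definition are system, user, assistant, tool, and function. A minimal sketch of checking them explicitly (the helper and its name are hypothetical, not from this PR):

```python
# Hypothetical helper: validate the role explicitly instead of
# silently defaulting to "assistant".
VALID_ROLES = {"system", "user", "assistant", "tool", "function"}

def get_role(metadata: dict) -> str:
    role = metadata.get("role") or (metadata.get("rawResponse") or {}).get("role")
    if role not in VALID_ROLES:
        raise ValueError(f"Unexpected role: {role!r}")
    return role
```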
```python
elif isinstance(output_data, OutputDataWithValue):
    if isinstance(output_data.value, str):
        content = output_data.value
    elif output_data.kind == "tool_calls":
        assert isinstance(output, OutputDataWithToolCallsValue)
```
I feel like this can be simplified:

```python
if isinstance(output_data, str):
    ...  # Do string stuff
elif isinstance(output_data, OutputDataWithToolCallsValue):
    ...  # Do function call stuff
elif isinstance(output_data, OutputDataWithValue):
    ...  # Do other value stuff where data.value is a string
elif isinstance(output_data, ChatCompletionMessage):
    ...  # Do chat completion stuff
```
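One subtlety with a flat chain like this: since `OutputDataWithToolCallsValue` subclasses `OutputDataWithValue`, the subclass check has to come before the superclass check or its branch becomes unreachable. A self-contained toy illustration (the classes here are stand-ins, not the real aiconfig types):

```python
# Toy stand-ins to show isinstance ordering; not the real types.
class OutputDataWithValue: ...
class OutputDataWithToolCallsValue(OutputDataWithValue): ...

data = OutputDataWithToolCallsValue()

if isinstance(data, OutputDataWithToolCallsValue):  # subclass checked first
    print("tool calls branch")
elif isinstance(data, OutputDataWithValue):  # would shadow the subclass if swapped
    print("plain value branch")
```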
I find this more complicated and unintuitive. `OutputDataWithToolCallsValue` is a subset of `OutputDataWithValue`. I get what you're saying that you can have a single else-if instead of nested if-statements, but it's jarring to go from "small subset" --> "bigger subset" while reading down through if-else logic.

This is also more confusing because the inner `# Do string stuff`, `# Do function call stuff`, etc. all need to individually check for the role type being assistant before we can do anything.
```python
if message.get("content", None) is not None:
    output_data = message.get("content")
elif message.get("tool_calls", None) is not None:
    tool_calls = [ToolCallData(
```
nit: this is fine for now, but we might need to update this in the future, as `tool_calls` can include things that aren't just function calls.
Good callout, I'll add another check for the type being 'function'.
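A hedged sketch of that extra check, assuming the OpenAI chat-completions shape for `tool_calls` entries (`id`, `type`, `function`); the `ToolCallData` constructor arguments here are assumptions, not from this PR:

```python
# Sketch only: keep just the entries whose type is "function".
tool_calls = [
    ToolCallData(id=tc["id"], function=tc["function"], type="function")
    for tc in message.get("tool_calls", [])
    if tc.get("type") == "function"
]
```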
```python
accumulated_message += new_text
options.stream_callback(new_text, accumulated_message, 0)
output.data = accumulated_message
if isinstance(new_text, str):
```
Can this be anything other than string? These changes seem unnecessary if I'm understanding the streamer correctly
Streamer can actually contain anything; it doesn't have to be text. Theoretically it should always be str (except for the stop signal), but I'm just being safe. This is a fork of the HF source code (couldn't find it publicly on GH, but it's this: https://github.com/PaddlePaddle/PaddleNLP/blob/a55039cc16bbcfa06b204a5f21b813e98a3f65ca/paddlenlp/generation/streamers.py#L197-L206).
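A minimal sketch of the defensive loop (the `streamer`, `options`, and `options.stream_callback` names follow the snippet above; everything else is assumed):

```python
accumulated_message = ""
for new_text in streamer:
    # Defensive: the streamer can theoretically yield non-text items
    # (e.g. a stop signal), so only accumulate strings.
    if not isinstance(new_text, str):
        continue
    accumulated_message += new_text
    if options and options.stream_callback:
        options.stream_callback(new_text, accumulated_message, 0)
```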
```python
# HuggingFace Text generation does not support function
# calls so shouldn't get here, but just being safe
return json.dumps(output_data.value, indent=2)
raise ValueError("Not Implemented")
```
For these cases we could technically `json.dumps(output_data)`, right?
It should never be possible to get to this point, which is why I raised an error to be cautious. But yeah, I can just remove the error and fall into the final `return ""` instead.
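In other words, something like this sketch (the surrounding structure is assumed from the snippet above):

```python
# Sketch: no error raised; unsupported kinds fall through to the final return.
if isinstance(output_data, OutputDataWithValue):
    if isinstance(output_data.value, str):
        return output_data.value
    # HuggingFace text generation does not support function calls,
    # so anything else should be unreachable; fall through.
return ""
```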
```python
# If response_includes_details is false, `iteration` will be a string,
# otherwise, `iteration` is a TextGenerationStreamResponse
```
Should we update the `response` type to `Iterable[Union[TextGenerationStreamResponse, str]]` to reflect this?
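A sketch of what that annotation could look like (the import path for `TextGenerationStreamResponse` varies across `huggingface_hub` versions, so it is left as a forward reference here; `.token.text` is assumed from that type's shape):

```python
from typing import Iterable, Union

def accumulate(response: "Iterable[Union[TextGenerationStreamResponse, str]]") -> str:
    acc = ""
    for iteration in response:
        # A plain str when response_includes_details is false; otherwise a
        # TextGenerationStreamResponse whose token carries the text.
        acc += iteration if isinstance(iteration, str) else iteration.token.text
    return acc
```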
extensions/LLama-Guard/python/python/src/aiconfig_extension_llama_guard/LLamaGuard.py
extensions/LLama-Guard/python/python/src/aiconfig_extension_llama_guard/LLamaGuard.py
For testing, can you please be sure to run the tests and the relevant cookbooks/demo scripts (e.g. to test function calling)?
Update: Created issue to track this in #659
I ran the automated tests and function calls still work. I get what you're saying about just testing the relevant cookbook, so what's below isn't to disagree with your comment, just sharing general thoughts to flag: I would be surprised if all the cookbooks still worked after these output-refactoring PRs from the last few days, since we're changing the way we output data and thus changing how we should access it. I feel that automated testing for the cookbooks should be P0 after we finish the local editor MVP. I've mentioned to Sarmad that it's simply not feasible to test them all manually. If any of them are broken now, to me having automated testing + fixing them is a separate work project.
This is extending #605, where I converted from model-specific response types --> pure strings. Now I'm going to also support function calls --> OutputData types.

Same as the last diff (#610), except now for Python files instead of TypeScript.

Don't need to do this for the image outputs since I already did that in #608.

Actually, like TypeScript, the only model parser we need to support for function calling is `openai.py`.
See comment here for context: #611 (comment). Separating this diff from #611 to make it easier and get that one unblocked.
Explicit str type checking in model parsers. See comment here for context: #611 (comment). Separating this diff from #611 to make it easier and get that one unblocked.