-
-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SpeziLLMOpenAI: Replace MacPaw/OpenAI With Generated API Calls #64
base: main
Are you sure you want to change the base?
Conversation
d5a4eb3
to
3ded304
Compare
48f3c0b
to
974449b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all the work here @paulhdk; very important to improve this setup and build on the OpenAPI specification!
It would be amazing to get a first insight from @philippzagar to get a good round of feedback.
@@ -51,7 +50,7 @@ public struct LLMOpenAIModelParameters: Sendable { | |||
/// - logitBias: Alters specific token's likelihood in completion. | |||
/// - user: Unique identifier for the end-user, aiding in abuse monitoring. | |||
public init( | |||
responseFormat: ChatQuery.ResponseFormat? = nil, | |||
responseFormat: Components.Schemas.CreateChatCompletionRequest.response_formatPayload? = nil, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if we should add compact type aliases for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve added an LLMOpenAIRequestType
alias. Does that work for you?
Should we also introduce an alias for Components.Schemas
in general? This won’t make the types shorter, but something like LLMOpenAIGeneratedTypes could improve readability, maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can introduce well defined and named typealias for the specific types that we use in our API surface; we should see if we can make them compact and focus on them.
/// "firstName": [ | ||
/// "type": "string", | ||
/// "description": "The first name of the person") | ||
/// ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if we can add a nicely typed type for this instead of a dictionary; it can always map to a dictionary under the hood. Would be cool to avoid loosing that type-safe element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, SpeziLLMOpenAI wrapped around the Swift types provided by the OpenAI
package, which would then eventually be passed to the API.
With the OpenAI OpenAPI spec, such types aren't generated, but the JSON schemas are instead validated for correctness as they're being encoded in the OpenAPIObjectContainer type.
Introducing such wrapper types again would require precise alignment with the OpenAI, which would make it, I could imagine, harder to maintain over time.
I could imagine that’s one reason why the official OpenAI Python package, which is also generated from the OpenAI OpenAPI specification, does not offer wrapper types either, AFAICT.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think adding an extension initializer/function that takes the well-typed arguments of one wants to use them would be beneficial and would avoid issues with string keys that are not correct or malformatted. Still allowing to pass in a dictionary might be an escape hatch that we can still provide. The OpenAPI surface is quite stable and if we use e.g. an enum for the type of the parameter can also have an other
case with an associated string value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I most definitely agree with @PSchmiedmayer here, I would follow the typical Swift paradigm and provide as much type safety as possible.
As mentioned by @PSchmiedmayer, I would implement well-typed inits / functions etc. that then map to the underlying String
dictionary. And yes, an escape hatch that passes the raw dictionary might be beneficial!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear and because I'm a little clueless about how to implement this easily: the only way I'm seeing here is to re-implement these two types from the MacPaw OpenAI package that SpeziLLM was using previously, and then reverting the changes around the initialisers accordingly.
SpeziLLM/Sources/SpeziLLMOpenAI/FunctionCalling/LLMFunctionParameterWrapper.swift
Lines 12 to 15 in 26b1e07
/// Alias of the OpenAI `JSONSchema/Property` type, describing properties within an object schema. | |
public typealias LLMFunctionParameterPropertySchema = ChatQuery.ChatCompletionToolParam.FunctionDefinition.FunctionParameters.Property | |
/// Alias of the OpenAI `JSONSchema/Item` type, describing array items within an array schema. | |
public typealias LLMFunctionParameterItemSchema = ChatQuery.ChatCompletionToolParam.FunctionDefinition.FunctionParameters.Property.Items |
I was not able to find a definitive documentation for the fileds that the OpenAI API accepts here, including the ones that are currently support by SpeziLLM, e.g., minItems
, maxItems
, uniqueItems
.
The function calling documentation mentions none of them: https://platform.openai.com/docs/guides/function-calling
What do you think?
Sources/SpeziLLMOpenAI/FunctionCalling/LLMFunctionParameterSchemaCollector.swift
Outdated
Show resolved
Hide resolved
bdd7127
to
1f52a7f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for continuing to work on this @paulhdk!
I had a quick sync with @philippzagar and he will take a closer look at the PR to provide insights on the different changes; would be great to update the PR to the latest version of main to resolve the conflicts; I think after the feedback from @philippzagar we should be ready to get this merged 🚀
.Input(body: .json(LLMOpenAIRequestType( | ||
messages: openAIContext, | ||
model: schema.parameters.modelType, | ||
frequency_penalty: schema.modelParameters.frequencyPenalty, | ||
logit_bias: schema.modelParameters.logitBias.additionalProperties.isEmpty ? nil : schema | ||
.modelParameters | ||
.logitBias, | ||
max_tokens: schema.modelParameters.maxOutputLength, | ||
n: schema.modelParameters.completionsPerOutput, | ||
presence_penalty: schema.modelParameters.presencePenalty, | ||
response_format: schema.modelParameters.responseFormat, | ||
seed: schema.modelParameters.seed, | ||
stop: LLMOpenAIRequestType.stopPayload.case2(schema.modelParameters.stopSequence), | ||
stream: true, | ||
temperature: schema.modelParameters.temperature, | ||
top_p: schema.modelParameters.topP, | ||
tools: functions.isEmpty ? nil : functions, | ||
user: schema.modelParameters.user | ||
))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to format this similar to our other code bases; might be applicable to other parts as well:
.Input(body: .json(LLMOpenAIRequestType( | |
messages: openAIContext, | |
model: schema.parameters.modelType, | |
frequency_penalty: schema.modelParameters.frequencyPenalty, | |
logit_bias: schema.modelParameters.logitBias.additionalProperties.isEmpty ? nil : schema | |
.modelParameters | |
.logitBias, | |
max_tokens: schema.modelParameters.maxOutputLength, | |
n: schema.modelParameters.completionsPerOutput, | |
presence_penalty: schema.modelParameters.presencePenalty, | |
response_format: schema.modelParameters.responseFormat, | |
seed: schema.modelParameters.seed, | |
stop: LLMOpenAIRequestType.stopPayload.case2(schema.modelParameters.stopSequence), | |
stream: true, | |
temperature: schema.modelParameters.temperature, | |
top_p: schema.modelParameters.topP, | |
tools: functions.isEmpty ? nil : functions, | |
user: schema.modelParameters.user | |
))) | |
.Input(body: | |
.json( | |
LLMOpenAIRequestType( | |
messages: openAIContext, | |
model: schema.parameters.modelType, | |
frequency_penalty: schema.modelParameters.frequencyPenalty, | |
logit_bias: schema.modelParameters.logitBias.additionalProperties.isEmpty ? nil : schema | |
.modelParameters | |
.logitBias, | |
max_tokens: schema.modelParameters.maxOutputLength, | |
n: schema.modelParameters.completionsPerOutput, | |
presence_penalty: schema.modelParameters.presencePenalty, | |
response_format: schema.modelParameters.responseFormat, | |
seed: schema.modelParameters.seed, | |
stop: LLMOpenAIRequestType.stopPayload.case2(schema.modelParameters.stopSequence), | |
stream: true, | |
temperature: schema.modelParameters.temperature, | |
top_p: schema.modelParameters.topP, | |
tools: functions.isEmpty ? nil : functions, | |
user: schema.modelParameters.user | |
) | |
) | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @PSchmiedmayer here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this coding style being enforced in the other code bases?
I played around with some Swiftformat rules yesterday while trying to make it produce the example above. But I was unsuccessful and had to update it manually. Is there a better way to catch all the remaining examples in the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the great PR @paulhdk, it's great to move away from the OpenAI package! 🚀
Had some feedback on some parts of the PR, please also clean up the leftover compile errors in the testing code as well as possible linter issues.
Also, it would be great to get rid of some Swift warnings / deprecations that are tagged via the CI!
One more thing: Please add a small documentation (in the README and DocC) on how to (re-)generate the client (needed if the API changes at some point).
/// "firstName": [ | ||
/// "type": "string", | ||
/// "description": "The first name of the person") | ||
/// ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I most definitely agree with @PSchmiedmayer here, I would follow the typical Swift paradigm and provide as much type safety as possible.
As mentioned by @PSchmiedmayer, I would implement well-typed inits / functions etc. that then map to the underlying String
dictionary. And yes, an escape hatch that passes the raw dictionary might be beneficial!
"required": requiredPropertyNames | ||
]) | ||
} catch { | ||
logger.error("Error creating OpenAPIObjectContainer.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a fallthrough error handeling really beneficial here?
To be fair, errors should be very rare here, only when the validated container cannot be created
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fatalError()
, then? Or can you think of a nicer way to handle this?
Sources/SpeziLLMOpenAI/FunctionCalling/LLMFunctionParameterWrapper+ArrayTypes.swift
Show resolved
Hide resolved
XCTAssertEqual(schemaArrayBool.schema.items?.type, .boolean) | ||
XCTAssertEqual(schemaArrayBool.schema.items?.const, "true") | ||
schema = schemaArrayBool.schema.value | ||
items = schema["items"] as? [String: Any] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit ugly that we loose all underlying type information of the respective properties of the schema here.. Mabye some computed properties on the schema with a specific type would make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. The problem is that the underlying type for the schema is the generated OpenAPIRuntime.OpenAPIObjectContainer
type, which boils down to a [String: (any Sendable)?]
type and is what we get back from the API / generated calls.
Which schema
are you referring to here? I don't think I'm following where you would want that computed property to be.
]) | ||
} catch { | ||
print("unable to initialse schema in LLMOpenAIParameterCustomTypesTets") | ||
return .init() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens down the line if the validation fails and an empty schema is created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't something like this do the trick?
diff --git a/Tests/SpeziLLMTests/LLMOpenAIParameterTests+CustomTypes.swift b/Tests/SpeziLLMTests/LLMOpenAIParameterTests+CustomTypes.swift
index ca262bd..3bbce43 100644
--- a/Tests/SpeziLLMTests/LLMOpenAIParameterTests+CustomTypes.swift
+++ b/Tests/SpeziLLMTests/LLMOpenAIParameterTests+CustomTypes.swift
@@ -11,6 +11,10 @@ import XCTest
final class LLMOpenAIParameterCustomTypesTests: XCTestCase {
+ override func setUp() {
+ continueAfterFailure = false
+ }
+
struct CustomType: LLMFunctionParameterArrayElement, Encodable, Equatable {
static var itemSchema: LLMFunctionParameterItemSchema = {
do {
@@ -28,8 +32,7 @@ final class LLMOpenAIParameterCustomTypesTests: XCTestCase {
]
])
} catch {
- print("unable to initialse schema in LLMOpenAIParameterCustomTypesTets")
- return .init()
+ XCTFail("Unable to initialse schema in LLMOpenAIParameterCustomTypesTets")
}
}()
The compiler is still expecting a return
after the XCTFail()
. Do you have another suggestion?
Tests/UITests/TestApp/LLMOpenAI/Functions/LLMOpenAIFunctionPerson.swift
Outdated
Show resolved
Hide resolved
@paulhdk A short ping on the state of this PR; would be amazing to get it merged before we tag a new 0.X version tag in combination with the MLX changes 🚀 |
First of all, thank you @philippzagar for the review!
I"m OOF until the end of the week. Will be able to address all the feedback here with full force starting next week Monday. I'm sorry for letting this sit so long, thaz wasn't my plan either .. |
Thank you @paulhdk! 🚀 |
4a70bb7
to
0c5505d
Compare
endpoint This enables swift-openapi-generator to generate streamed responses. See: openai/openai-openapi#311
Modifier was mistakenly removed in 99e0e6d
0e1eb92
to
3ded7a0
Compare
@paulhdk Great to see tons of progress on the PR! 🥳 |
Hey @paulhdk, |
I've been sick all week; today is the first day where I feel ready to do some work. The wrapper types suggested by you and @PSchmiedmayer are still missing and will require some thinking on my part. Doing my best to get all of this done ASAP. |
…eterWrapper.swift
This reverts commit 1a6cc12.
Co-authored-by: Paul Schmiedmayer <[email protected]>
@philippzagar I've left questions for the remaining comments above where I need your help. Thank you for taking another look! |
SpeziLLMOpenAI: Replace MacPaw/OpenAI With Generated API Calls
♻️ Current situation & Problem
This PR replaces the MacPaw/OpenAI package with generated API calls by the swift-openapi-generator package.
Calls are generated from OpenAI's official OpenAPI spec.
Closes: #49
This PR does not add any new features but simply replicates the existing feature set with the generated API calls.
I've tried my best to keep track of any known issues in-code with FIXMEs as well as in the following list.
Current Issues
Sources/SpeziLLMOpenAI/LLMOpenAISession+Generation.swift
does not handle the "[DONE]" message sent by the API to conclude a stream. There is currently a hacky workaround that catches the error that is thrown in that case. I'm not quite sure yet how to handle that case elegantly.LLMFunctionParameterItemSchema
type does not use a generated type yet.SpeziLLMOpenAI/FunctionCalling
should be, if possible, refactored, as they currently have a lot of optional bindings.openapi-generator-swift
expects and anopenapi.yaml
and a configuration file in the TestApp, which is why there are duplicate openapi specs and configuration files in this PR. I'm not quite sure why it's expecting them in the TestApp, but I suspect it has something to do with the generated types being used in the TestApp's model selection mechanism.⚙️ Release Notes
📚 Documentation
As no new functionality is added, nothing should change here (unless I missed something).
✅ Testing
This PR passes the existing tests. Since no new functionality has been added, I believe this should suffice.
📝 Code of Conduct & Contributing Guidelines
By submitting creating this pull request, you agree to follow our Code of Conduct and Contributing Guidelines: