Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove placeholders #2

Merged
merged 39 commits into from
Dec 5, 2024
Merged

Remove placeholders #2

merged 39 commits into from
Dec 5, 2024

Conversation

AdrienVannson
Copy link
Contributor

Summary

Hello!

This is a first step in removing the PLACEHOLDER mechanism to greatly improve performance, make the whole code more Pythonic and simple and resolve a lot of bug. The motivation is more deeply described in danielgtaylor/python-betterproto#631

This PR completely removes placeholder values from messages. Instead, fields are always set with there default value (empty string, 0, None for optional, members of a oneof and messages, ...). Therefore, all the message field are marked as optional and can be set to None, which is coherent with the specification (see the issue).

As a consequence, this fixes the following issues:

Before merging it, I think it would be good to finish the work in progress (at least my other PR and another fix of the generation of the comments), and release a new version with all the recent bugfixes.

Performance

These changes also make it possible to remove the __getattribute__ from messages. As expected, this has a great impact on the performance.

With the following messages:

message Test {
    bool x = 1;
    int32 y = 2;
    float z = 3;
}

message List {
    repeated Test msg = 1;
}

The following code takes 3.24s on my computer with the original (master) implementation with the Rust codec, and only 2s with my PR (without the Rust codec). Thus, even if the Rust codec is not supported for now, it is not a problem since the performance are better anyway.

msg = List(msg=[Test(x=bool(i % 2), y = i, z = i / 2) for i in range(100000)])
b = bytes(msg)
msg2 = List().parse(b)
assert msg == msg2

Breaking changes

This PR introduces two small breaking changes:

  • Lazy-initialization of messages is no longer possible, since uninitialized message fields now have None as value and getattribute is no longer defined.
  • The serialized_on_wire function is now removed, since messages that were not serialized on the wire are now represented as None values. This function didn't really make sense in the first place imo, since being serialized is a property of a field of a message, not a property of a message itself (google's version has HasField, which indeed takes the field as a parameter).

Checklist

  • If code changes were made then they have been tested.
    • I have updated the documentation to reflect the changes.
  • This PR fixes an issue.
  • This PR is a breaking change (e.g. methods or parameters removed/renamed)

@AdrienVannson AdrienVannson merged commit cf8c41f into main Dec 5, 2024
19 checks passed
@AdrienVannson AdrienVannson deleted the remove-placeholders-2 branch December 5, 2024 10:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant