Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use JSON as the wire format in Gearman jobs #1931

Merged
merged 6 commits into from
May 24, 2024

Conversation

sevein
Copy link
Member

@sevein sevein commented Apr 23, 2024

This pull request replaces all uses of pickle with json. The use of pickle was introduced back in 2011 when MCP was integrated with Gearman due to its ability to round-trip Python objects, at the cost of lower security. However, apart from datetime.datetime and uuid.UUID, we are not encoding any complex structures that cannot be easily represented in JSON.

Please notice that the use of orjson was an afterthought and that's why it was submitted as an extra commit. orjson is significantly faster compared to the standard library json module, being faster than pickle protocol=0 that we were using previously. It also supports native serialization of datetime and UUID objects.

Fixes archivematica/Issues#1673.

@sevein sevein force-pushed the dev/issue-1673-gearman-json-encoding branch from e4b271b to fcde702 Compare April 23, 2024 04:09
@sevein sevein added the AMAUAT Issues relating to the improvement of the AM Automated Acceptance tests label Apr 23, 2024
@sevein sevein force-pushed the dev/issue-1673-gearman-json-encoding branch from 4cef34f to 64bcdcd Compare April 23, 2024 11:43
@sevein sevein marked this pull request as ready for review April 23, 2024 12:23
@sevein sevein requested a review from a team April 23, 2024 12:23
@sevein sevein added AMAUAT Issues relating to the improvement of the AM Automated Acceptance tests and removed AMAUAT Issues relating to the improvement of the AM Automated Acceptance tests labels Apr 23, 2024
@sevein sevein marked this pull request as draft April 23, 2024 13:44
@sevein sevein removed the request for review from a team April 23, 2024 13:44
@sevein sevein added AMAUAT Issues relating to the improvement of the AM Automated Acceptance tests and removed AMAUAT Issues relating to the improvement of the AM Automated Acceptance tests labels Apr 23, 2024
@sevein sevein force-pushed the dev/issue-1673-gearman-json-encoding branch 2 times, most recently from 3cc2774 to e91a496 Compare April 25, 2024 18:36
@sevein sevein added the AMAUAT Issues relating to the improvement of the AM Automated Acceptance tests label Apr 25, 2024
@sevein sevein marked this pull request as ready for review April 25, 2024 20:24
@sevein sevein added AMAUAT Issues relating to the improvement of the AM Automated Acceptance tests and removed AMAUAT Issues relating to the improvement of the AM Automated Acceptance tests labels Apr 27, 2024
Copy link
Member

@replaceafill replaceafill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @sevein, thank you!

I think this makes the interaction with Gearman clearer and easier to debug 🙇 and I like that the use of orjson is nicely abstracted in a single module.

hack/docker-compose.yml Outdated Show resolved Hide resolved
sevein added 2 commits May 24, 2024 15:39
orjson is significantly faster and comes with native serialization of some
types like datetime or UUID.
This commit modifies the MCPServer to encode the `createdDate` of tasks using
`datetime` which is natively supported by our JSONDataEncoder. The conversion
to the METS-compliant format now occurs during the decoding process on the
client side. This change ensures consistent use of RFC 3339 format across all
communications.
@sevein sevein force-pushed the dev/issue-1673-gearman-json-encoding branch from 02cf944 to 11c72fa Compare May 24, 2024 15:40
@sevein sevein merged commit c464570 into qa/1.x May 24, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AMAUAT Issues relating to the improvement of the AM Automated Acceptance tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Problem: use of pickle for data serialization is disproportionate
2 participants