-
-
Notifications
You must be signed in to change notification settings - Fork 630
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimise dumping to reduce unnecessary overhead #1649
base: dev
Are you sure you want to change the base?
Optimise dumping to reduce unnecessary overhead #1649
Conversation
This lines up pretty well with the discussion in #805. I think the main difference is that we also proposed caching logic that depends on the object type. I think the scope of this PR would be a good incremental improvement. |
I didn't even know about #805, good to know! My ideas were something similar: figure out a way to cache values for certain types at least. For example, if I know that a specific field is a string property and it is 100% there, there should be no reason to have complicated lookups or enforcing types. I didn't have time to delve deeper into the various Field subclasses, but As I'm typing this, I wonder if it would be possible to do something like lima is doing, and compile a single function which dumps the entire object, where each Field subclass could in theory produce the "most optimized" serializer for itself. |
4f04fbe
to
595b12b
Compare
595b12b
to
ee835bf
Compare
I don't know if this is of any interest, but with some minimal changes, marshmallow can be cythonized with setuptools-cythonize and together with this MR, the performance is around ~45% better! |
I'm not familiar with cythonize. Does cythonization itself - without this PR - bring a significant improvement? Is there anything we could do to make it available more easily? Like, could/should we distribute binary packages? Would it be useful to many users or is it a niche? |
I tested it without this MR a long time ago, so I don't have the numbers, but IIRC it did bring around 10% of speed on its own. But I guess that should be examined in more detail. I'm honestly not sure how niche it is, I started looking into it because I ran into these performance problems when dumping large(ish) datasets with marshmallow, but didn't want to give up dynamic schemas & validation that it provides. I haven't had the time yet, but I was hoping to look into speeding up individual fields as well, I feel like there is a lot of overhead there which should be easy to eliminate without breaking anything. |
ee835bf
to
df37ea1
Compare
When dumping many objects, marshmallow is calling the same field methods over and over again, which return the same values. Parts of this process can be called only once per dump, which reduces python method call overhead significantly. `Field.get_serializer` returns the optimized serializer for the current dump operation, avoiding the expensive lookups for properties which will not change during a single dump (such as `data_key`, `default`, etc) Also, the default `Schema.get_attribute` is also not used because all it does is calling `utils._get_value_for_key(s)`.
df37ea1
to
622d8d0
Compare
NB: Please take this MR more as a "starting a discussion" than "production-ready" code, because I hacked a lot of things.
When dumping many objects, marshmallow is calling the same field methods
over and over again, which return the same values. Parts of this process
can be called only once per dump, which reduces python method call
overhead significantly.
Field.get_serializer
returns the optimized serializer for the currentdump operation, avoiding the expensive lookups for properties which will
not change during a single dump (such as
data_key
,default
, etc)Also, the default
Schema.get_attribute
is also not used because all itdoes is calling
utils._get_value_for_key(s)
.Benchmarks show around 30-35% improvement, which is quite significant even for this hacky patch:
My motivation here is that marshmallow is excellent when it comes to schema validation, but according to the benchmarks, there is a lot of overhead in there. The question is, is there a better way of improving serialization performance without sacrificing all the good things about marshmallow?