Batch and error check allocations #3224
LGTM, thanks for the PR 🎉
LGTM, just a little thing to fix, but otherwise it seems fair.
LGTM
While looking for optimization opportunities, I found several allocation requests (malloc and PyMem_New calls) right next to each other, and figured that calling the allocator once should be faster than calling it two or three times. It is also simpler to check one buffer for allocation failure, and to free one buffer at the end, than to do so for two or three.
While reading the code I also noticed that the malloc calls in the blur routines weren't error checked at all, so I fixed that.
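For reviewers who want the pattern in isolation, here is a minimal sketch of what "batching" means here. It is not the code from this PR: the struct, the function names, and the three buffer names (row, col, kernel) are hypothetical, and it assumes all buffers hold the same element type so that alignment is not a concern when carving up the block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical example type: three float arrays backed by ONE allocation. */
typedef struct {
    float *block;  /* the single allocation; the only pointer passed to free() */
    float *row;    /* first row_len floats of block */
    float *col;    /* next col_len floats */
    float *kernel; /* last kernel_len floats */
} blur_buffers;

/* Allocate all three arrays with a single malloc call.
 * Returns 0 on success, -1 on size overflow or allocation failure. */
static int
blur_buffers_alloc(blur_buffers *b, size_t row_len, size_t col_len,
                   size_t kernel_len)
{
    size_t total;

    b->block = b->row = b->col = b->kernel = NULL;

    /* Guard the size arithmetic against overflow before allocating. */
    if (row_len > SIZE_MAX - col_len) {
        return -1;
    }
    total = row_len + col_len;
    if (total > SIZE_MAX - kernel_len) {
        return -1;
    }
    total += kernel_len;
    if (total == 0 || total > SIZE_MAX / sizeof(float)) {
        return -1;
    }

    /* One allocation, one error check. */
    b->block = malloc(total * sizeof(float));
    if (b->block == NULL) {
        return -1;
    }

    /* Carve the block into consecutive sub-arrays. */
    b->row = b->block;
    b->col = b->row + row_len;
    b->kernel = b->col + col_len;
    return 0;
}

/* One free releases everything; the sub-pointers must not be freed. */
static void
blur_buffers_free(blur_buffers *b)
{
    free(b->block);
    b->block = b->row = b->col = b->kernel = NULL;
}
```

In a CPython extension function, the failure branch would typically be `return PyErr_NoMemory();` (or the equivalent error path of the surrounding function) rather than a bare -1.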
In microbenchmarks that strongly favor this change, I saw very small performance improvements (around 1-3%) or no measurable impact. I'm not sure whether running identical calls in a tight loop makes malloc / PyMem_New cheaper than they would otherwise be, so performance may be affected differently in a more realistic scenario.
Anyway, since this is a small change, I'm not pitching it as a performance PR but as a code quality PR. We want our code to be the best it can be, and I think reducing the number of allocation calls helps with that goal.