-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reallocting location array only when bytecode size is greater than the location size. #84
Conversation
src/lyk.c
Outdated
// YKOPT: Reallocating for every instruction is inefficient. | ||
YkLocation **new_locations = calloc(f->sizecode, sizeof(YkLocation *)); | ||
lua_assert(new_locations != NULL && "Expected yklocs to be defined!"); | ||
if (f->yklocs == NULL || f->sizecode > f->yklocs_size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should be something like:
if (f->yklocs == NULL) {
f->ykcallocs = calloc(...);
f->yklocs_size = ...;
...
}
if (f->sizecode > f->yklocs_size) {
realloc(...)
}
A related question: are sizecode
and yklocs_size
counting number of elements or number of bytes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my first implementation used realloc, but I hadn't thought of this. Is realloc
always faster than calloc
?
sizecode
keeps track of number of instructions in the bytecode array, I am guessing that yklocs_size
contains number of locations. So to answer your question, so both are counting number of elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use calloc
you allocate two chunks of memory, and have to free one. realloc
does that and it moves the old data you had.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed here: 60dca65
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am guessing that yklocs_size contains number of locations
Can you check and make sure of this please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is number of locations. Here is the commit fixing the comments: 1dee9a9
@ltratt please lmk if we need to change anything else in this PR. |
Please squash. |
@ltratt done. |
@nmdis1999 Please force push an update with an updated commit message. The title needs to be capitalised and of the form "Optimise the YkLocation buffer" and the body of the commit needs to explain why this was done: what was the problem? why this solution? are there some (summary) benchmark numbers? |
@ltratt done |
Can you force push an update without the (slightly dodgy, as it's not done properly on the RHS) |
Location array was reallocated for every instructions, which was very ineffective since we only need to reallocate the buffer either when yk location is set to NULL or when the number of bytecode instructions is greater than number of location in the yk location array. This was causing around 1000x slowdown in benchmark test such as verybig.lua. The fix improves verybig.lua by 300x and brings performance from 198sec to 0.650sec when JIT is off.
done. |
Previously location array was reallocated for every instruction, which was ineffective and caused a lot of performance overhead for programs which did not need reallocation (such as those which had fewer case when bytecode size was larger than the current location array size). This fix fetches back the performance so lost.