stack implementation and tests #4116
- some changes in the sln to build with the v143 build tools (VS 2022)
- 2 new pjsystest project configurations to build as Debug-Dynamic and Release-Dynamic
- stack implementation and tests incorporated into the pjlib and pjlib-test projects
Thank you for the contribution. Perhaps you can tell us the reason for introducing this new data structure? (i.e. what future PR/feature do you plan to submit that will require the use of a stack?)
Sorry for the delay in replying, I will definitely reply, but I will be busy for a couple more days.
Sorry for the delay in replying,
No worries at all about the delay, we are preparing for the final testing of the 2.15 release, so most likely we can't merge this until the release anyway. It's still not obvious to me from your examples when you need to use the stack. Typically in the SIP context, we use a queue since we need to process the messages/events that come first (FIFO). In what specific case would you use LIFO?
For iocp, I would like to invite you to check PR #4136 and let us know your feedback: do you also encounter a similar issue; if yes, have you also fixed it in your code; will that PR potentially conflict with what you're doing, etc.
About iocp and PR #4136: I saw and fixed some issues with op_key reuse (WSAOVERLAPPED). I implemented a reference counting mechanism on the key (OS HANDLE) that ensures the key is returned to free_list only when iocp reports that all pending operations have completed (so we no longer need closing_list, only active_list and free_list). Of course, I should take a closer look at PR #4136 to comment more.
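To make the reference counting idea above concrete, here is a minimal single-threaded sketch. It is not the code from either PR: the type and function names (demo_key, demo_key_op_posted, and so on) are hypothetical, and a real ioqueue would need atomic counters or a lock around these fields. It only shows the rule "a closing key goes back to the free list once its pending-operation count reaches zero".

```c
#include <assert.h>

/* Hypothetical key states; "active" and "free" mirror active_list/free_list. */
typedef enum demo_key_state { DEMO_KEY_FREE, DEMO_KEY_ACTIVE } demo_key_state;

typedef struct demo_key {
    demo_key_state state;
    int closing;      /* unregistration has been requested */
    int pending_ops;  /* reference count: operations posted but not yet completed */
} demo_key;

static void demo_key_activate(demo_key *k)
{
    k->state = DEMO_KEY_ACTIVE;
    k->closing = 0;
    k->pending_ops = 0;
}

/* Return the key to the free state only when nothing is in flight. */
static void demo_key_release_if_idle(demo_key *k)
{
    if (k->closing && k->pending_ops == 0)
        k->state = DEMO_KEY_FREE;
}

/* Called when an asynchronous operation is posted on the key. */
static void demo_key_op_posted(demo_key *k)
{
    assert(k->state == DEMO_KEY_ACTIVE && !k->closing);
    ++k->pending_ops;
}

/* Called when the completion for one posted operation is reported. */
static void demo_key_op_completed(demo_key *k)
{
    assert(k->pending_ops > 0);
    --k->pending_ops;
    demo_key_release_if_idle(k);
}

/* Called when the application unregisters the key. */
static void demo_key_unregister(demo_key *k)
{
    k->closing = 1;
    demo_key_release_if_idle(k);
}
```

With this rule a separate closing list is unnecessary in the sketch: a key that is being closed simply stays active until its last completion arrives.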
I added stack_stress_test() as one of the typical examples of when we can use the stack (reserving an empty slot in a large array without having to lock the entire array).
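The slot reservation use case can be sketched roughly as follows. This is not the code of stack_stress_test() and does not use the pj_stack API; all names (demo_slot_stack, demo_slot_reserve, ...) are hypothetical, and a small C11 atomic_flag spinlock stands in for whatever internal locking or lock-free mechanism the real stack uses. The point is that only the tiny free-index stack is protected, never the array itself.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_SLOT_COUNT 8      /* hypothetical array size */
#define DEMO_NO_SLOT    (-1)   /* returned when no slot is free */

/* Stack of free slot indices, threaded through next_free[]. */
typedef struct demo_slot_stack {
    atomic_flag lock;                     /* guards only top and next_free */
    int         top;                      /* first free index or DEMO_NO_SLOT */
    int         next_free[DEMO_SLOT_COUNT];
} demo_slot_stack;

static void demo_lock(demo_slot_stack *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire)) {
        /* spin until the flag is released */
    }
}

static void demo_unlock(demo_slot_stack *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Initially every slot is free: 0 -> 1 -> ... -> DEMO_SLOT_COUNT-1. */
static void demo_slots_init(demo_slot_stack *s)
{
    int i;
    atomic_flag_clear(&s->lock);
    for (i = 0; i < DEMO_SLOT_COUNT; ++i)
        s->next_free[i] = (i + 1 < DEMO_SLOT_COUNT) ? i + 1 : DEMO_NO_SLOT;
    s->top = 0;
}

/* Pop a free index; the caller then owns array[slot] exclusively. */
static int demo_slot_reserve(demo_slot_stack *s)
{
    int slot;
    demo_lock(s);
    slot = s->top;
    if (slot != DEMO_NO_SLOT)
        s->top = s->next_free[slot];
    demo_unlock(s);
    return slot;
}

/* Push an index back so that it can be reused (LIFO order). */
static void demo_slot_release(demo_slot_stack *s, int slot)
{
    assert(slot >= 0 && slot < DEMO_SLOT_COUNT);
    demo_lock(s);
    s->next_free[slot] = s->top;
    s->top = slot;
    demo_unlock(s);
}
```

LIFO order is a natural fit here because it does not matter which free slot is handed out, and the most recently released slot is the cheapest one to return.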
decreased the repeat counter increased the number of threads
Just need to clean up the project files and you should be mostly good to go.
For an example of a clean project patch, you can check PR 4132 (https://github.com/pjsip/pjproject/pull/4132/files#diff-0c444d946963ac7a6c002817133fdab6c0cef69c43bd85eff0dcd9c6419ed7ad)
Cleaned, i.e. line endings fixed.
Anything is possible, but when I said complexity, I meant this: if we insert "new" before or after "old", we have to make sure that "old" still exists in the container. I think this requires some external synchronization, and if so, I think we have to use this synchronization with all other operations.
Yes, there are many such examples, but there are also opposite situations showing that external synchronization is not always optimal. A quick search in a single file, pjsua_core.c (only because I fixed "Merged request detected" in this file a few days ago), turns up pjsua_transport_create()
on_return: Here we hold the lock for unnecessarily long, but with pj_stack this can be done like this: I don't think we often call this function concurrently :) but pjsip uses this pattern everywhere, and in some cases this global locking may lead to serious performance degradation. Another example from the same file is mod_pjsua_on_rx_request(). These examples just show that external locking is not always a good choice, and internal locking is not always a bad choice.
These should be all from me.
pjlib/include/pj/stack.h
Outdated
#endif // PJ_STACK_IMPLEMENTATION

#include <pj/stack.h>
Is this recursive self inclusion intentional?
mistake! Thanks!
pjlib/include/pj/stack.h
Outdated
* be aligned by 8 (for x86) or 16 (for x64) byte.
* pjsip build system define PJ_POOL_ALIGNMENT macro to corresponding value.
* winnt.h define MEMORY_ALLOCATION_ALIGNMENT macro for this purpose.
* To use this macro in build system we recomend (this is optional) to add #include <windows.h>
Perhaps better to put the "optional" note in the beginning of the paragraph, so it's clear to the reader.
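For readers unfamiliar with the alignment requirement quoted above, here is a small sketch of what "aligned by 8 (x86) or 16 (x64) bytes" means in practice. The macro and function names are hypothetical, and the assumption that the required alignment equals twice the pointer size is only an illustration that matches the 8/16 figures in the comment; it is not taken from the PR's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration: 8 bytes on 32-bit targets, 16 bytes on
 * 64-bit targets, i.e. twice the pointer size. Hypothetical name. */
#define DEMO_STACK_ALIGNMENT (2 * sizeof(void *))

/* Non-zero when p satisfies the assumed alignment. */
static int demo_is_aligned(const void *p)
{
    return ((uintptr_t)p % DEMO_STACK_ALIGNMENT) == 0;
}

/* Round p up to the next aligned address (DEMO_STACK_ALIGNMENT is a power of two). */
static void *demo_align_up(void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)DEMO_STACK_ALIGNMENT - 1;
    return (void *)((addr + mask) & ~mask);
}
```

A check like demo_is_aligned() could be used as a debug assertion on each node before it is pushed, rather than imposing the alignment on every allocation.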
pjlib/include/pj/stack.h
Outdated
* Stack in PJLIB is single-linked list with First In Last Out logic.
* Stack is thread safe. Common PJLIB stack implementation uses internal locking mechanism so is thread-safe.
* Implementation for Windows platform uses locking free Windows embeded single linked list implementation.
* The performance of pj_stack implementation for Windows platform is 2-5x higher than cross-platform.
Putting numbers here without test data may not be wise.
So maybe just put "considerably" higher/faster.
pjlib/src/pj/stack.c
Outdated
* because the item count can be changed at any time by another thread.
* For Windows platform returns the number of entries in the stack modulo 65535. For example,
* if the specified stack contains 65536 entries, pj_stack_size returns zero.
*
pjlib/src/pj/stack_win32.c
Outdated
* because the item count can be changed at any time by another thread.
* For Windows platform returns the number of entries in the stack modulo 65535. For example,
* if the specified stack contains 65536 entries, pj_stack_size returns zero.
*
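The "65536 entries reports zero" example in the comment above is what one gets when the depth is reported through an unsigned 16-bit value, i.e. wraparound modulo 65536. The sketch below only demonstrates that arithmetic; demo_reported_depth is a hypothetical helper, and the assumption is that the Windows implementation forwards a 16-bit depth as returned by the underlying list API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: what a 16-bit depth counter reports for a given
 * real number of entries. Conversion to uint16_t is modulo 65536. */
static uint16_t demo_reported_depth(uint32_t actual_entries)
{
    return (uint16_t)actual_entries;
}
```

So callers should treat the size as a hint: besides being racy, it cannot distinguish an empty stack from one holding an exact multiple of 65536 entries.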
for (i = 0; !rc && i < PJ_ARRAY_SIZE(tests); ++i) {
    tests[i].state.pool = pool;
    rc = stack_stress_test(&tests[i]);
Perhaps we should disable the stress test by default on non-Windows platforms (create a macro such as #HAS_STRESS_TEST PJ_WIN32).
It will only burden the CI machines, and there's little point in stress testing a relatively slower implementation of a regular LIFO list + mutex.
I think we need multithreaded testing for a thread-safe API, but currently the default implementation is cross-platform, so maybe Windows-only testing is enough.
Create a macro HAS_MT_STACK_STRESS_TEST
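One way such a switch could look is sketched below. Only the macro name HAS_MT_STACK_STRESS_TEST comes from this thread; the default rule (on when PJ_WIN32 is defined to a non-zero value, off otherwise) and the demo_* test functions are assumptions for illustration, not the code that was committed.

```c
#include <assert.h>

/* Assumed default: run the multithreaded stress test only on Windows
 * builds, where PJ_WIN32 is expected to be defined to a non-zero value.
 * A build can still override it with -DHAS_MT_STACK_STRESS_TEST=1. */
#ifndef HAS_MT_STACK_STRESS_TEST
#  if defined(PJ_WIN32) && PJ_WIN32 != 0
#    define HAS_MT_STACK_STRESS_TEST 1
#  else
#    define HAS_MT_STACK_STRESS_TEST 0
#  endif
#endif

/* Hypothetical stand-ins for the real test functions; 0 means success. */
static int demo_basic_test(void) { return 0; }

#if HAS_MT_STACK_STRESS_TEST
static int demo_stress_test(void) { return 0; }
#endif

/* Runs the enabled tests and reports how many were executed. */
static int demo_stack_test(int *tests_run)
{
    int rc;

    *tests_run = 0;
    rc = demo_basic_test();
    ++*tests_run;
#if HAS_MT_STACK_STRESS_TEST
    if (rc == 0) {
        rc = demo_stress_test();
        ++*tests_run;
    }
#endif
    return rc;
}
```

Keeping the macro overridable lets a developer still run the stress test on the mutex-based implementation locally without burdening CI.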
all fixed!
These should be all from me
I hope this will be the case this time! :)
True, of course. I guess this may also answer your previous question?
@nanangizz
# Conflicts: # pjlib/build/pjlib.vcxproj.filters
Hi!
Hello, colleagues! About pj_stack...
Hi @LeonidGoltsblat, first of all thanks for your contribution and for the time you spent submitting this. We're just back from the (pjsip) office holiday, so coding is definitely slow this time of year. :) I've read your submission, and here are my comments.
As others have said, my first reaction is: why do we need this? What problem does it solve, or what enhancement does it offer? I don't see your patch addressing either of these two questions quantitatively. For example, if it proposes significant speed improvements, I would like to be convinced with the numbers. First the theoretical performance improvement (e.g. how long to
On the other hand, this is quite a significant submission, as it modifies some of our core header files, adds third party copyright (i.e. you), and imposes certain alignment on all pool memory allocations. I would probably only check the alignment in the
So I tend to say we keep this aside for now until we can be convinced there is a real, significant usage for it.
Hi Benny, Merry Christmas and thanks for the feedback! I'm currently preparing a PR with a parallel version of the conference bridge. Hopefully it'll only take a few days, then I'll be back to discuss and happy to answer any questions you may have.
The parallel conf bridge sounds exciting! But if the purpose is to show another sample implementation where the lock free stack can be used, I'm afraid this is sounding more and more like the stack is a solution looking for a problem :)
Just an idea, perhaps it's better to rename the data structure from stack to something like
The reason is simple. Using a stack to solve the various problems in the other PR seems kind of strange and, as @bennylp pointed out, sounds forced, as if the stack is made to solve something it's not supposed to. But with a more appropriate name, it suddenly just seems natural. To solve the problem of resource allocation (such as unused slots) stored in an array that requires synchronisation, it makes sense IMO to use an
And the data structure also opens the door to being used elsewhere in the library that currently uses the doubly-linked list declared in
I'm closing this for now, with the following suggestion if similar work is to be resubmitted in the future:
This is a stack container implementation for the pjsip library (pj_stack), i.e. a singly linked list with First In Last Out (FILO) logic.
The stack is implemented as close as possible to the implementation of pj_list.
This implementation is thread safe. The common implementation uses an internal locking mechanism.
The implementation for the Windows platform instead uses the lock-free singly linked list built into Windows, which makes it exceptionally fast.
This is one of the basic mechanisms used in a series of subsequent pull requests with the overall goal of improving pjsip's performance.
The pull request contains tests that can be used as usage examples.
pjlib/src/pj/stack.c is the common stack implementation.
pjlib/src/pj/stack_win32.c is the implementation for the Windows platform.
See the embedded documentation for more information.