patricia should use heap offset pointers #18
Comments
And how much memory would this save in a real-world example? I don't see it being enough to warrant the added complexity and additional assumptions that would have to be made about the underlying platform.
It makes no assumptions about the platform; it just uses an int32_t instead of a pointer. No preprocessor macros are required. The memory savings are substantial because this reduces the node size from 141 bytes to 73. That's huge.
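For illustration, here is a minimal sketch of where that size difference comes from on an LP64 system. The field names are hypothetical and do not reflect mowgli's actual patricia_elem layout:

```c
/* Illustrative only: these are NOT mowgli's actual patricia_elem fields,
 * just a simplified node showing where the savings come from on LP64. */
#include <stdint.h>

struct node_ptr {                /* pointer-based links                  */
    struct node_ptr *parent;     /* 8 bytes each with 64-bit pointers    */
    struct node_ptr *child[2];
    void *data;
    uint32_t bitnum;
};                               /* 8*4 + 4 -> ~40 bytes after padding   */

struct node_off {                /* offset-based links                   */
    int32_t parent;              /* 4 bytes each: index into the pool    */
    int32_t child[2];
    int32_t data;                /* or keep the payload a real pointer   */
    uint32_t bitnum;
};                               /* 4*5 = 20 bytes                       */
```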
Talking about the node size takes no account of the size of what's actually stored in the container, which in many cases will vastly outweigh the size of the node itself. So, what's the real-world memory saving for a reasonable application, like Atheme with several thousand users and accounts?
Hello. Please do not post image macros on bug reports in Atheme projects. They add no signal and serve only to rile people up. Regarding the README entry, I believe whoever made that documentation change accidentally refers to it as a hashtable there because it presents a hashtable-like API to the end user.

I am not convinced that we should use pointer offsets in this way, because it may break the build on some embedded platforms where pointers are actually offsets into a lookup table. While Mowgli itself is not used directly on such platforms, variants of the underlying code are. Do you believe your offset proposal will work in an environment where pointers do not represent direct memory addresses?

Also, in general, it is my view that a radix tree still uses less memory than a hashtable when empty, since only a limited number of nodes is allocated, versus at least one pointer per bucket for a hashtable. That said, I agree that 141 bytes per tree node on systems with 64-bit pointers is heavier than it should be. It should indeed be looked into.
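As a rough worked comparison of that empty-container claim, under assumed sizes (the bucket count and node count below are chosen purely for illustration, not mowgli's actual defaults):

```c
/* Back-of-the-envelope comparison of empty-container overhead on LP64.
 * The bucket count, node size, and node count are assumptions chosen
 * purely for illustration. */
#include <stdio.h>

int main(void)
{
    const unsigned buckets     = 256; /* hypothetical hashtable size      */
    const unsigned ptr_size    = 8;   /* one pointer per bucket on LP64   */
    const unsigned node_size   = 141; /* node size quoted in this thread  */
    const unsigned empty_nodes = 1;   /* e.g. just a root node            */

    printf("empty hashtable: %u bytes\n", buckets * ptr_size);      /* 2048 */
    printf("empty patricia:  %u bytes\n", empty_nodes * node_size); /* 141  */
    return 0;
}
```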
Also, it seems that using an offset instead of a pointer may break if a memory pool crosses more than one VMA on systems with split-range ASLR, such as PaX. The reason is that the base address of a later VMA may come from an entirely different range than that of the first VMA; for example, VMAs may be split evenly between a 0x700000000000 VIRT_BASE and a 0x300000 VIRT_BASE. I believe that your proposed change will not work safely in such an environment. A basic test program exercising the patricia routines is actually available at
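To make the concern concrete, here is a quick check that the distance between the two example VIRT_BASE values quoted above cannot be stored in a 32-bit offset (the addresses are the ones from the comment, not real allocations):

```c
/* Quick check of the split-VMA concern: the two example VIRT_BASE values
 * quoted above are far enough apart that their distance cannot be stored
 * in a 32-bit offset. The addresses are the ones from the comment, not
 * real allocations. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t high  = 0x700000000000ULL; /* VIRT_BASE of one mapping   */
    uint64_t low   = 0x300000ULL;       /* VIRT_BASE of the other     */
    uint64_t delta = high - low;

    printf("delta = %#llx, fits in 32 bits: %s\n",
           (unsigned long long)delta,
           delta <= UINT32_MAX ? "yes" : "no");   /* prints "no" */
    return 0;
}
```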
Regarding the image macro, @Aerdan observed that GitHub has finally added comment deletion. So, we have taken the liberty of removing the image macro and all posts related to it, as they are off-topic.
Regarding I am assuming the former, but am seeking clarification in this regard. In the case of the former, I feel that
Virtual memory translation tables + ASLR should have no relation to the offsets discussed here, since the heap is still contiguous. Even if you have non-contiguous allocations, that's fine, because you can just index by "which block" (the base address is likely a multiple of the kernel's page size) plus an internal block offset. Furthermore, since you'll probably have a maximum block size for dynamic growth, and therefore a maximum number of blocks which fit in your heap, your pair of offsets will occupy very few bits. Offsets-rather-than-pointers is one of the easiest and most effective techniques for decreasing internal fragmentation (you typically see this in malloc implementations). Furthermore, I think this may in fact enhance the clarity of the code, since instead of arbitrary pointers you now have explicit information about which block you're dealing with. Just my 2 cents.
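A minimal sketch of the block-index-plus-offset scheme described above. The block size, table size, and names are assumptions for illustration, not anything mowgli currently provides; with 64 KiB blocks and at most 256 of them, the whole reference fits in 24 bits:

```c
/* Minimal sketch of the block-index + block-offset idea described above.
 * Block size, table size, and names are assumptions, not mowgli API. */
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SHIFT 16                      /* 64 KiB blocks              */
#define BLOCK_SIZE  (1u << BLOCK_SHIFT)

typedef uint32_t node_ref;                  /* (block index | offset)     */

struct pool {
    void  *blocks[256];                     /* base address of each block */
    size_t nblocks;
};

static inline node_ref make_ref(uint32_t block, uint32_t offset)
{
    return (block << BLOCK_SHIFT) | offset;
}

static inline void *deref(const struct pool *p, node_ref r)
{
    return (char *)p->blocks[r >> BLOCK_SHIFT] + (r & (BLOCK_SIZE - 1));
}
```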
@keroserene Actually, they do, because we have different mapped regions for every Nth page in the memory pool. Thus it is possible that allocations are not contiguous, as Linux will allocate from both the lowest 24 bits and the highest 24 bits on amd64 (look at some stack traces in gdb if you don't believe it). Once you cross that boundary on 64 bits, a 32-bit value cannot describe the offset.

However, @jart clarified what she was actually after on Twitter, so I will work on it this weekend (maybe earlier, but we're busy with planning-related meetings at work, so I am not sure how much time I will have between now and then). The requested change seems reasonable; the source of confusion was that, from my initial read of the bug report, I thought you wanted to store a 32-bit address offset. We will have to do this in a couple of steps:
I need to think a little about how to implement step 2; I have a few solutions, but they're kind of ugly. I guess I could make the index class freestanding and make it not my problem.
The memory overhead of your patricia container on 64-bit systems is suboptimal. You already have your own memory pools, so you can very easily cut the patricia overhead in half by replacing the patricia_elem pointers with a uint32_t offset from node_heap. I'd send you a pull request, but you don't have any unit tests 👎
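A rough sketch of the conversion this proposal implies, assuming the node pool is a single contiguous region; heap_base, elem_size, and NIL_REF are placeholders, not part of the mowgli API:

```c
/* Rough sketch of the pointer <-> uint32_t conversion the proposal
 * implies. Assumes one contiguous node pool with a known base;
 * heap_base, elem_size, and NIL_REF are placeholders only. */
#include <stddef.h>
#include <stdint.h>

static unsigned char *heap_base;   /* base address of the node pool       */
static size_t         elem_size;   /* sizeof(struct patricia_elem)        */

#define NIL_REF UINT32_MAX         /* reserved value standing in for NULL */

static inline uint32_t elem_to_ref(const void *elem)
{
    if (elem == NULL)
        return NIL_REF;
    return (uint32_t)(((const unsigned char *)elem - heap_base) / elem_size);
}

static inline void *ref_to_elem(uint32_t ref)
{
    return ref == NIL_REF ? NULL : heap_base + (size_t)ref * elem_size;
}
```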
It would be even better to have a 24-bit offset pointer represented as (uint8_t << 16) | uint16_t. That way, 32-bit systems would also be able to benefit from the lower memory usage. However, such a pointer would impose a limit of no more than 2^24 nodes. I think that's an acceptable statically-defined limit, since 2^24 patricia_elems would take up 896 megs of memory.
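A sketch of that 24-bit "short pointer": an 8-bit high part plus a 16-bit low part, limiting the container to 2^24 nodes. The names are illustrative; the packed attribute (GCC/Clang) keeps the struct at 3 bytes, otherwise it would usually be padded to 4:

```c
#include <stdint.h>

struct ref24 {
    uint8_t  hi;   /* bits 23..16 of the node index */
    uint16_t lo;   /* bits 15..0 of the node index  */
} __attribute__((packed));

static inline uint32_t ref24_get(struct ref24 r)
{
    return ((uint32_t)r.hi << 16) | r.lo;
}

static inline struct ref24 ref24_make(uint32_t index)
{
    /* caller must guarantee index < (1u << 24) */
    struct ref24 r = { (uint8_t)(index >> 16), (uint16_t)(index & 0xFFFFu) };
    return r;
}
```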
You should also create a mowgli_patricia_sizeof() function so people can measure the cost of using the patricia data structure.
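One possible shape for such a function, sketched against placeholder structs since mowgli's internal layouts are not spelled out here; a real version would live in patricia.c next to the real types:

```c
/* Placeholder types standing in for mowgli's internal structs. */
#include <stddef.h>

struct patricia_elem_placeholder { void *child[2]; void *data; unsigned bit; };

struct patricia_placeholder {
    size_t count;   /* number of stored elements */
};

static size_t
patricia_sizeof_placeholder(const struct patricia_placeholder *dict)
{
    /* header plus one node per stored element; allocator slack ignored */
    return sizeof(*dict)
         + dict->count * sizeof(struct patricia_elem_placeholder);
}
```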
Also you should not call patricia a "hashtable" in the README because radix trees don't hash anything. You'd be better off describing it as "a radix tree dictionary that offers performance at the expense of using a hilarious amount of memory".