Hash tables #1661

Merged
merged 40 commits into from
Jan 22, 2025

Conversation

Bike
Member

@Bike Bike commented Jan 22, 2025

Rewrites and fixes weak hash tables and weak pointers.

Bike added 30 commits January 22, 2025 10:07
Move the important code into gctools where it belongs, add support
for "weak pointers" of immediate objects (because whether something
is immediate is an implementation detail we shouldn't expose).
with many caveats.
this looking for a key simultaneously thing is unused and unclean.
but not gethash, yet, because it gets complicated
it's way too ad hoc - it takes a string stream and a filename and
writes to standard output, and calls itself with the print
argument flipped anyway? If we run into hash table problems again
we can come up with something cleaner or impermanent.
the sxhashKey deferring to static functions in HashTable_O thing
has been bugging me for years. begone. Also moves hash stuff out
of HashTable_O which is nice for organization.

Another bonus: equalp hash on a string no longer conses. I hadn't
realized it consed.
Returning a pointer to the inside of a Lisp object is something to
avoid, and more so if it's a bespoke structure.

This change breaks Cando, but I've looked at Cando's code and it
should work with this fine once I update it.
seriously unnecessary. I assume the intent was to avoid consing up
a double float object for defaulting, but we do have singles.
again trying to present a coherent API. Also delete an unused
function that uses KeyValuePair directly.
I'm reasonably sure this should never be tripped: no_key is not
usually available as a value, so nobody's storing it in hash tables
from Lisp. In C++ we only put it in the value as the default when
we're resizing a table, and in that case we make the key no_key as
well. So the only way you'd have a key that's not no_key and a
value that is no_key is, I think, if your hash table is thread-unsafe
and you store a new key/value in it while another thread looks for
that key. If that's what you're doing, you're in crazytown to
begin with, so the test is not going to help.
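
The invariant argued above (a slot is either fully empty or fully populated) can be sketched as follows. This is a toy single-threaded open-addressing table, not Clasp's implementation; `SketchTable`, `Slot`, and `NO_KEY` are hypothetical names, and the sentinel is modelled as an all-ones word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these names come from Clasp.
using Word = std::uintptr_t;
constexpr Word NO_KEY = ~Word{0}; // sentinel, never stored as a real key or value

struct Slot {
  Word key = NO_KEY;
  Word value = NO_KEY; // empty slots carry the sentinel in both fields
};

class SketchTable {
public:
  explicit SketchTable(std::size_t capacity) : slots_(capacity) {
    assert(capacity >= 2);
  }

  void put(Word key, Word value) {
    assert(key != NO_KEY && value != NO_KEY);
    if ((count_ + 1) * 2 > slots_.size()) grow(); // keep load at or below 1/2
    insert(slots_, key, value, count_);
  }

  std::optional<Word> get(Word key) const {
    std::size_t n = slots_.size();
    for (std::size_t i = 0, idx = key % n; i < n; ++i, idx = (idx + 1) % n) {
      const Slot& s = slots_[idx];
      if (s.key == NO_KEY) return std::nullopt; // empty slot: key is absent
      if (s.key == key) return s.value;         // live key implies live value
    }
    return std::nullopt;
  }

  // True if every slot has both fields set or both fields equal to NO_KEY.
  bool invariant_holds() const {
    for (const Slot& s : slots_)
      if ((s.key == NO_KEY) != (s.value == NO_KEY)) return false;
    return true;
  }

private:
  static void insert(std::vector<Slot>& slots, Word key, Word value,
                     std::size_t& count) {
    std::size_t n = slots.size();
    for (std::size_t idx = key % n;; idx = (idx + 1) % n) {
      Slot& s = slots[idx];
      if (s.key == NO_KEY) { s = Slot{key, value}; ++count; return; }
      if (s.key == key) { s.value = value; return; }
    }
  }

  void grow() {
    // New slots default to {NO_KEY, NO_KEY}, matching the resize case above.
    std::vector<Slot> bigger(slots_.size() * 2);
    std::size_t count = 0;
    for (const Slot& s : slots_)
      if (s.key != NO_KEY) insert(bigger, s.key, s.value, count);
    slots_ = std::move(bigger);
    count_ = count;
  }

  std::vector<Slot> slots_;
  std::size_t count_ = 0;
};
```

In this sketch the only writer of `NO_KEY` into a value is the resize path, and it writes the key at the same time, so a single-threaded reader can never observe a real key paired with a sentinel value.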
The idea is to let external code (e.g. Cando) use these createEq
whatevers instead of requiring a HashTableEq type they don't need.
This will eventually let me remove the test types or at least make
them an internal implementation detail.
this was an optimization of uncertain utility that locks in the
type hierarchy, so away it goes
don't need this skulduggery.
The old implementation is still around for now because there's a
lot to delete and I'm not sure these new tables will be scanned
correctly for snapshots.
the table has to be searched, since afaik there's no way to make
boehm decrement a counter when it disappears a pointer. I kept the
_Count field so that we don't have to do that iteration when
checking for rehash (i.e. every time we setf gethash).
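
A rough model of the trade-off described above, assuming `std::weak_ptr` as a stand-in for a weak link that the collector clears without notifying the table. `WeakKeySketch` and its members are hypothetical names, not the actual table code: the cached count is cheap but can overstate the number of live entries, while the exact count requires walking every entry.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical model: std::weak_ptr stands in for a GC-cleared weak key.
class WeakKeySketch {
public:
  void put(const std::shared_ptr<int>& key, int value) {
    assert(key != nullptr);
    for (auto& e : entries_) {
      if (e.first.lock() == key) { e.second = value; return; }
    }
    entries_.emplace_back(key, value);
    ++count_; // bumped on insert, never decremented when a key dies
  }

  // Cheap: suitable for a rehash check on every store.
  std::size_t stored_count() const { return count_; }

  // Exact: has to visit every entry because dead keys vanish silently.
  std::size_t live_count() const {
    std::size_t n = 0;
    for (const auto& e : entries_)
      if (!e.first.expired()) ++n;
    return n;
  }

  // Rehash drops dead entries and resynchronises the cached count.
  void rehash() {
    std::vector<std::pair<std::weak_ptr<int>, int>> kept;
    for (auto& e : entries_)
      if (!e.first.expired()) kept.push_back(std::move(e));
    entries_ = std::move(kept);
    count_ = entries_.size();
  }

private:
  std::vector<std::pair<std::weak_ptr<int>, int>> entries_;
  std::size_t count_ = 0;
};
```

Using the cached count for the rehash decision means a table may rehash slightly earlier than strictly necessary, but each store avoids a full walk.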
It's still weak, just not a WeakKeyHashTable, which is of course
the old code that will be removed.
apparently we don't have a test for this. FIXME
Probably need to make the scanner a little more sophisticated for
ephemerons and stuff. weak_scan is busted down to nothing here.
I don't think it really does us any good. Each kind of weak object
pretty much needs to be scanned (or not) in its own special way so
grouping them doesn't accomplish anything. If we ever have separate
pools for them like we did with MPS, they can just have normal
general headers.
Not useful yet but it took me a while to figure it out
This should allow us to smoothly support objects with Lisp pointers
more than 62 words in.

Also, I don't think the Lisp object is even actually that big
right now - presumably it used to be - so this might save us a
microsecond of scanning time as we use the bitmap.
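
The commit message does not say how the 62-word limit is lifted. As one possible reading, here is a sketch of a pointer-field bitmap that spills across several 64-bit words instead of being packed into one. `LayoutBitmapSketch` and its methods are hypothetical and do not reflect the real layout code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout descriptor: bit i set means word i of the object
// holds a pointer the scanner must visit.
class LayoutBitmapSketch {
public:
  explicit LayoutBitmapSketch(std::size_t object_words)
      : words_(object_words), bits_((object_words + 63) / 64, 0) {}

  void mark_pointer_field(std::size_t word_index) {
    assert(word_index < words_);
    bits_[word_index / 64] |= std::uint64_t{1} << (word_index % 64);
  }

  // Returns the word offsets a scanner would visit, in increasing order.
  std::vector<std::size_t> pointer_offsets() const {
    std::vector<std::size_t> out;
    for (std::size_t chunk = 0; chunk < bits_.size(); ++chunk) {
      std::uint64_t b = bits_[chunk];
      for (std::size_t bit = 0; b != 0; ++bit, b >>= 1)
        if (b & 1) out.push_back(chunk * 64 + bit);
    }
    return out;
  }

private:
  std::size_t words_;
  std::vector<std::uint64_t> bits_;
};
```

With this shape, a field at word 70 is handled the same way as one at word 1, and an object whose pointers all sit in the first chunk costs only one bitmap word to scan.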
Bike added 8 commits January 22, 2025 10:09
Kind of inelegant but seems to work. The separate weak pointer
scanning could be used for implementing a garbage collector.

Also clears out some of the debugging stuff - we can do the field
check when we set up the field info, not every single time we scan
and make them traversable through ROOM

There may be an issue with the previous commit alone - it may not
allocate enough field infos/layouts in gc_boot.

The fact the memory walker uses T_O** fields is kind of dicey since
not all fields are actually pointers to pointers. We have the
hidden pointers in ephemerons, but also atomics in conses etc.
Saves us a dumb lookup (in find) and will be nice for keeping the
value alive when doing weak tables.
Since the analyzer knows about WeakPointer now, there's no point
in making it generic
They don't and I think can't work with boehm gc, but the code's
there for later.
most importantly, test that they actually are weak. I don't think
they were before my rewrite.
@Bike Bike merged commit e44cf82 into main Jan 22, 2025
5 of 9 checks passed
@Bike Bike deleted the hash-tables branch January 22, 2025 17:33