Crash consistency issue after acquiring bucket locks #19

Closed
iangneal opened this issue Apr 28, 2021 · 3 comments

@iangneal

Bug

Exposed by crashing after acquiring a lock from clht_put.

static inline int
lock_acq_chk_resize(clht_lock_t* lock, clht_hashtable_t* h)
{
  char once = 1;
  clht_lock_t l;
  while ((l = CAS_U8(lock, LOCK_FREE, LOCK_UPDATE)) == LOCK_UPDATE)
    {

  • Crashing after line 311 here leaves the lock held: it is never released, so the restarted example waits indefinitely for it
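To illustrate why the restart hangs, here is a minimal sketch that models the acquire loop with C11 atomics. Everything in it is an assumption made for illustration: `demo_lock_t` and `demo_cas_u8` are hypothetical stand-ins for `clht_lock_t` and `CAS_U8`, and the values `LOCK_FREE = 0` and `LOCK_UPDATE = 1` are inferred from the gdb output in this report rather than copied from the headers. The loop is bounded so that the demonstration terminates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed values: a held lock prints as 1 in the gdb session in this report. */
#define LOCK_FREE   0
#define LOCK_UPDATE 1

/* Hypothetical stand-in for clht_lock_t. */
typedef _Atomic uint8_t demo_lock_t;

/* Hypothetical stand-in for CAS_U8: returns the value seen before the swap. */
static uint8_t demo_cas_u8(demo_lock_t *lock, uint8_t expected, uint8_t desired)
{
    /* On failure, C11 writes the observed value into `expected`;
       on success, `expected` already equals the old value. */
    atomic_compare_exchange_strong(lock, &expected, desired);
    return expected;
}

/* Bounded version of the acquire loop.
   Returns 1 if the lock was taken, 0 if every attempt found it held. */
static int demo_try_acquire(demo_lock_t *lock, unsigned max_spins)
{
    for (unsigned i = 0; i < max_spins; i++) {
        if (demo_cas_u8(lock, LOCK_FREE, LOCK_UPDATE) == LOCK_FREE)
            return 1;
    }
    return 0;
}
```

If the lock byte lives in persistent memory and was left at `LOCK_UPDATE` by the crashed process, nobody is alive to release it, so `demo_try_acquire` fails no matter how large `max_spins` is. The unbounded loop in the original code would therefore spin forever.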

Steps to reproduce

gdb --args ./example 20 1
> break clht_lb_res.h:311
> run
> next
> p *lock
# should print "$1 = 1 '\001'"
> quit
# Then, re-run
./example 20 1

The second execution hangs indefinitely, waiting to acquire the lock.

Comments

I see your comments here about locking assumptions:

// Although our current implementation does not provide post-crash mechanism,
// the locks should be released after a crash (Please refer to the function clht_lock_initialization())
clht_lock_t lock;

Does this mean this is a known issue, or does clht_lock_initialization just need to be added to clht_create? I ask because it seems that clht_lock_initialization is called in other places, just not in the recovery procedure.
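For concreteness, the kind of recovery step I have in mind could look roughly like the sketch below. This is not the repository's `clht_lock_initialization`; `demo_bucket_t` and `demo_reset_locks` are hypothetical names, the bucket layout is simplified, and the lock values are assumed. It also assumes the pass runs single-threaded when the table is reopened, before any worker thread touches it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, matching the gdb output in this report. */
#define LOCK_FREE   0
#define LOCK_UPDATE 1

/* Hypothetical, simplified bucket; the real bucket has more fields. */
typedef struct demo_bucket {
    volatile uint8_t lock;
    struct demo_bucket *next;   /* overflow chain, NULL if none */
} demo_bucket_t;

/* Walk every bucket and its overflow chain and release any lock that a
   crashed run left held. Returns the number of stale locks cleared. */
static size_t demo_reset_locks(demo_bucket_t *table, size_t num_buckets)
{
    size_t cleared = 0;
    for (size_t i = 0; i < num_buckets; i++) {
        for (demo_bucket_t *b = &table[i]; b != NULL; b = b->next) {
            if (b->lock != LOCK_FREE) {
                b->lock = LOCK_FREE;
                cleared++;
            }
        }
    }
    return cleared;
}
```

Calling something like this at the start of the recovery path would let the second `./example 20 1` run proceed instead of spinning on a lock that no live thread holds.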

@SeKwonLee
Member

SeKwonLee commented Apr 28, 2021

Hi @Dahca,

Thanks for the report. We provide the PMDK version as a reference implementation that uses the pmemobj allocator, but it is not fully tested and does not yet implement the mechanisms we assumed in our paper (such as lock initialization and garbage collection). As my comments in the code indicate, this is a known implementation issue caused by the absence of one of those post-crash mechanisms, lock initialization. clht_lock_initialization is provided as a reference for anyone who wants to implement post-crash lock initialization. I agree that these mechanisms are necessary for the code to work properly in actual use, but I have not yet found time to work on them.

@iangneal
Author

Hey @SeKwonLee,

Thanks for the quick responses. I can also attempt to add a solution for this in the near future, but as I said in #18, I'll be slightly delayed by an upcoming deadline.

@SeKwonLee
Member

We are closing this issue because it is a known issue covered by one of the assumptions presented in our original paper.
