Lock.acquire throws NoNodeError #563

Open
telaviv opened this issue May 29, 2019 · 3 comments

@telaviv

telaviv commented May 29, 2019

Our stack traces show a NoNodeError being thrown here quite often:

node = self.client.create(self.create_path, self.data,

I thought this should be impossible because of this earlier line:

self._ensure_path()

but that's what we're seeing.

I'm not sure whether this is a client bug or a server bug.

@StephenSorriaux
Member

Hi,

What do you see in your ZooKeeper server logs?

@teeeg

teeeg commented May 31, 2019

Could this be the result of a race condition?

Because Kazoo doesn't yet (I think) support Container nodes, it's nice to clean up a lock's parent node after use:

lock.acquire()
lock.release()
client.delete(lock.path)  # suppressing NotEmptyError exceptions
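
The cleanup step above could be sketched roughly as follows. This is a self-contained illustration, not Kazoo code: `FakeClient`, `cleanup_lock_path`, and the locally defined `NotEmptyError` are hypothetical stand-ins. With a real client the delete call would be `KazooClient.delete` and the exception would be `kazoo.exceptions.NotEmptyError`.

```python
from contextlib import suppress


class NotEmptyError(Exception):
    """Stand-in for kazoo.exceptions.NotEmptyError (assumption for this sketch)."""


class FakeClient:
    """Hypothetical in-memory client: maps a path to the set of its child names."""

    def __init__(self):
        self.nodes = {}

    def ensure_path(self, path):
        self.nodes.setdefault(path, set())

    def delete(self, path):
        # Like ZooKeeper, refuse to delete a node that still has children.
        if self.nodes[path]:
            raise NotEmptyError(path)
        del self.nodes[path]


def cleanup_lock_path(client, path):
    """Best-effort removal of a lock's parent node after release.

    Returns True if the node was deleted, False if other contenders
    still have children under it.
    """
    with suppress(NotEmptyError):
        client.delete(path)
        return True
    return False
```

Suppressing `NotEmptyError` only protects contenders that have already created their child node. A contender that has ensured the path but not yet created its child is not covered.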

But I wonder whether multiple clients contending for the same lock can run into problems if client1 cleans up the path in the middle of client2's lock acquisition. My naïve guess is that the deletion would land sometime AFTER client2 has invoked

self._ensure_path()

but BEFORE

node = self.client.create(self.create_path, self.data,

Here's a test where I tried to simulate that and got the NoNodeError:

    def test_lock_race_conditions_delete_lock_path_during_acquire(self):
        event1 = self.make_event()
        lock1 = self.client.Lock(self.lockpath, "one")
        thread1 = self.make_thread(target=self._thread_lock_acquire_til_event,
                                   args=("one", lock1, event1))
        thread1.start()

        # wait for this thread to acquire the lock
        with self.condition:
            if not self.active_thread:
                self.condition.wait(5)
                eq_(self.active_thread, "one")

        client2 = self._get_client()
        client2.start()

        lock2 = client2.Lock(self.lockpath, "two")
        thread2 = self.make_thread(target=self._thread_lock_acquire_til_event,
                                   args=("two", lock2, self.make_event()))

        # wait until lock1 is released
        event1.set()
        wait = self.make_wait()
        wait(lambda: not lock1.is_acquired)

        # start lock2 acquisition
        thread2.start()
        try:
            # But, delete lock2's parent BEFORE lock2 node is created
            self.client.delete(self.lockpath)
        except NoNodeError:
            # lock2.acquire fails
            pass

        thread1.join()
        thread2.join()
        client2.stop()
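
The interleaving this test tries to provoke depends on thread timing, so it may not reproduce on every run. It can also be replayed deterministically against a toy model. The sketch below is only an illustration under stated assumptions: `FakeZk` is a hypothetical in-memory znode store, `NoNodeError` is a local stand-in for `kazoo.exceptions.NoNodeError`, and `/lock/contender` is a made-up child name. None of it is Kazoo's implementation.

```python
class NoNodeError(Exception):
    """Stand-in for kazoo.exceptions.NoNodeError (assumption for this sketch)."""


class FakeZk:
    """Hypothetical in-memory znode store, not Kazoo's API."""

    def __init__(self):
        self.nodes = {"/"}

    def ensure_path(self, path):
        self.nodes.add(path)

    def delete(self, path):
        if path not in self.nodes:
            raise NoNodeError(path)
        self.nodes.remove(path)

    def create(self, path):
        # A child can only be created under an existing parent.
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise NoNodeError(parent)
        self.nodes.add(path)
        return path


def replay_race(zk, lock_path="/lock"):
    """Replay the suspected interleaving one step at a time."""
    zk.ensure_path(lock_path)   # client2: self._ensure_path()
    zk.delete(lock_path)        # client1: cleans up the lock's parent node
    try:
        zk.create(lock_path + "/contender")  # client2: self.client.create(...)
    except NoNodeError:
        return "NoNodeError"
    return "created"
```

In this model the create fails because the parent was removed between the two client2 steps, which matches the window described above.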

One option might be to catch NoNodeError here and raise ForceRetry:

node = self.client.create(self.create_path, self.data,

I might be totally off, though.
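
The retry idea can be sketched outside of Kazoo like this. It is a minimal illustration, not a patch: `create_with_retry`, its `ensure_path` and `create` callables, and `max_tries` are hypothetical names, `NoNodeError` is a local stand-in for `kazoo.exceptions.NoNodeError`, and a plain loop takes the place of Kazoo's own retry machinery and `ForceRetry`.

```python
class NoNodeError(Exception):
    """Stand-in for kazoo.exceptions.NoNodeError (assumption for this sketch)."""


def create_with_retry(ensure_path, create, max_tries=3):
    """Re-run ensure_path + create when the parent vanished in between.

    `ensure_path` and `create` are caller-supplied callables standing in
    for self._ensure_path() and self.client.create(...).
    """
    for _ in range(max_tries):
        ensure_path()
        try:
            return create()
        except NoNodeError:
            # Parent was deleted by another client after ensure_path();
            # start over so the parent is recreated before the next create.
            continue
    raise NoNodeError("parent kept disappearing")
```

The important detail is that the retry re-runs the path check as well as the create. Retrying the create alone would keep failing while the parent is missing.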

@teeeg

teeeg commented Jun 7, 2019

This is the same race condition referenced in #329.
