SOLR-17405: allow a single thread to reestablish ZK session (#2914)
psalagnac authored Dec 20, 2024
1 parent bb75463 commit 9257091
Showing 3 changed files with 17 additions and 3 deletions.
3 changes: 2 additions & 1 deletion solr/CHANGES.txt
@@ -19,7 +19,8 @@ Optimizations

Bug Fixes
---------------------
-(No changes)
+* SOLR-17405: Fix race condition where Zookeeper session could be re-established by multiple threads concurrently in
+case of frequent session expirations. (Pierre Salagnac)

Dependency Upgrades
---------------------

@@ -179,7 +179,7 @@ public void process(WatchedEvent event) {
connectionStrategy.reconnect(
zkServerAddress,
client.getZkClientTimeout(),
-this,
+client.wrapWatcher(this),
new ZkClientConnectionStrategy.ZkUpdate() {
@Override
public void update(ZooKeeper keeper) {

@@ -218,6 +218,7 @@ private SolrZkClient(
} catch (InterruptedException e1) {
Thread.currentThread().interrupt();
}
+zkCallbackExecutor.shutdown();
zkConnManagerCallbackExecutor.shutdown();
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, e);
}

@@ -1077,7 +1078,19 @@ private final class ProcessWatchWithExecutor implements Watcher { // see below f
public void process(final WatchedEvent event) {
log.debug("Submitting job to respond to event {}", event);
try {
-if (watcher instanceof ConnectionManager) {
+// We want all the code that re-creates the Zookeeper session and then invokes
+// ZkController.onReconnect() to never be executed by two threads concurrently.
+// Pool 'zkConnManagerCallbackExecutor' is single-threaded. We make sure such events
+// are processed only by this pool. Consequently, in case of a session expiration, we
+// don't try to re-create a new session until the previous call to onReconnect()
+// has returned.
+//
+// All other events go to pool 'zkCallbackExecutor', which is unbounded and may
+// spawn as many threads as there are events to process.
+// This includes events on ConnectionManager other than session expiration. Consequently,
+// there is no deadlock when the thread currently reestablishing the session waits for
+// the 'SyncConnected' event.
+if (watcher instanceof ConnectionManager && event.getState() == Event.KeeperState.Expired) {
zkConnManagerCallbackExecutor.execute(() -> watcher.process(event));
} else {
zkCallbackExecutor.execute(
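
The routing rule described in the code comment of the last hunk can be illustrated in isolation. The following is a minimal sketch, not Solr's code: `ExpiredEventRouter`, its nested `State` enum, and both helper methods are hypothetical names invented for this illustration, and plain `Runnable` handlers stand in for ZooKeeper watchers. It only assumes the idea stated in the comment: expiration events go to a single-threaded pool, everything else goes to an unbounded pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical router; loosely modelled on the dispatch rule in the hunk above. */
final class ExpiredEventRouter {
  /** Hypothetical stand-in for ZooKeeper's keeper states; not the real enum. */
  enum State { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

  // Single thread: at most one session re-establishment runs at a time.
  private final ExecutorService expiredExecutor = Executors.newSingleThreadExecutor();
  // Unbounded pool: all other events, so they are never queued behind a reconnect.
  private final ExecutorService generalExecutor = Executors.newCachedThreadPool();

  void dispatch(State state, Runnable handler) {
    if (state == State.EXPIRED) {
      expiredExecutor.execute(handler);
    } else {
      generalExecutor.execute(handler);
    }
  }

  void shutdown() {
    expiredExecutor.shutdown();
    generalExecutor.shutdown();
  }

  /** Fires several simulated expirations; returns the peak number of handlers running at once. */
  static int peakConcurrentReconnects(int events) {
    ExpiredEventRouter router = new ExpiredEventRouter();
    AtomicInteger running = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    CountDownLatch done = new CountDownLatch(events);
    for (int i = 0; i < events; i++) {
      router.dispatch(State.EXPIRED, () -> {
        peak.accumulateAndGet(running.incrementAndGet(), Math::max);
        try {
          Thread.sleep(5); // simulate the work of re-creating a session
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
        running.decrementAndGet();
        done.countDown();
      });
    }
    awaitQuietly(done);
    router.shutdown();
    return peak.get();
  }

  /** The expiration handler blocks until a SYNC_CONNECTED event is handled on the other pool. */
  static boolean reconnectCompletes() {
    ExpiredEventRouter router = new ExpiredEventRouter();
    CountDownLatch connected = new CountDownLatch(1);
    CountDownLatch reconnected = new CountDownLatch(1);
    router.dispatch(State.EXPIRED, () -> {
      if (awaitQuietly(connected)) {
        reconnected.countDown();
      }
    });
    router.dispatch(State.SYNC_CONNECTED, connected::countDown);
    boolean ok = awaitQuietly(reconnected);
    router.shutdown();
    return ok;
  }

  private static boolean awaitQuietly(CountDownLatch latch) {
    try {
      return latch.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("peak concurrent reconnects: " + peakConcurrentReconnects(4));
    System.out.println("reconnect completed: " + reconnectCompletes());
  }
}
```

In this sketch, sending every event to the single-threaded pool would make `reconnectCompletes()` time out, because the connected notification would queue behind the handler that is waiting for it. That appears to be the reason the commit restricts the single-threaded pool to `Expired` events only.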
