Zookeeper pod keeps crashing when scaling down and up #513

hoyhbx · 2022-11-18T20:28:01Z

Description

(Describe the feature, bug, question, proposal that you are requesting)
We found that zookeeper pod keeps crashing after a scale-down workload and a scale-up workload.
We found the crash is because the new pods are reusing the undeleted PVC during the scale-down process. However, this problem still persists even when we specify the reclaimPolicy: Delete.

The root cause of this issue is because there is no guard for scale-up before finishing deleting the PVC. We found that zookeeper-operator checks if the upgrade fails before updating the statefulSet, but this check misses checking for the undeleted OrphanPVCs. When reusing the old PVC, the startup script mistakenly thinks the membership list on the old PVC is the up-to-date one and skips updating it. When ZooKeeper starts, it crashes due to inconsistent membership list.

The orphaned PVCs will never get deleted because the operator always waits for the number of ready replicas to equal to desired replicas, which will never happen since one pod keeps crashing.

Importance

(Indicate the importance of this issue to you (blocker, must-have, should-have, nice-to-have))
Importance: must-have

Location

(Where is the piece of code, package, or document affected by this issue?)
The PVC cleanup code is here:

zookeeper-operator/controllers/zookeepercluster_controller.go

Line 707 in c29d475

    
           func (r *ZookeeperClusterReconciler) cleanupOrphanPVCs(instance *zookeeperv1beta1.ZookeeperCluster) (err error) {

The logic to upgrade the statefulSet is here:

zookeeper-operator/controllers/zookeepercluster_controller.go

Line 264 in c29d475

    
           func (r *ZookeeperClusterReconciler) upgradeStatefulSet(instance *zookeeperv1beta1.ZookeeperCluster, foundSts *appsv1.StatefulSet) (err error) {

Suggestions for an improvement

And we think that the operator should check if all OrphanPVCs are deleted before proceeding to the next upgrade.
We suggest to add a check to check if there is no Orphaned PVCs before updating the statefulSet

(How do you suggest to fix or proceed with this issue?)

The text was updated successfully, but these errors were encountered:

hoyhbx · 2023-04-07T20:11:05Z

This PR (#526) can fix this issue.
An alternative (and simpler) fix is to fix the start up script in the zookeeper container, so that the membership list is updated on start. The root cause of the crashing zookeeper pods when reusing the PVC is because the membership list file on the old PVC does not contain the new node.

hoyhbx · 2023-04-10T19:59:35Z

@anishakj Do you mind to take a look at this?
The root cause of this bug is because the restarted zookeeper node has inconsistent membership list.
In the zookeeper teardown script, the ID of the node is removed from the membership list from the dynamic configuration file which exist on all PVCs (

zookeeper-operator/docker/bin/zookeeperTeardown.sh

Line 45 in 72bea54

    
           java -Dlog4j.configuration=file:"$LOG4J_CONF" -jar /opt/libs/zu.jar remove $ZKURL $MYID

).
But in the zookeeper start up script, the script does not update the dynamic configuration file (

zookeeper-operator/docker/bin/zookeeperStart.sh

Lines 131 to 137 in 72bea54

    
           if [[ "$REGISTER_NODE" == true ]]; then 
        
               ROLE=observer 
        
               ZKURL=$(zkConnectionString) 
        
               ZKCONFIG=$(zkConfig) 
        
               set -e 
        
               echo Registering node and writing local configuration to disk. 
        
               java -Dlog4j.configuration=file:"$LOG4J_CONF" -jar /opt/libs/zu.jar add $ZKURL $MYID  $ZKCONFIG $DYNCONFIG

) if the myid and the dynamic config file already exist (this happens if the PVC is reused) (

zookeeper-operator/docker/bin/zookeeperStart.sh

Lines 110 to 114 in 72bea54

    
           if [[ "$ONDISK_MYID_CONFIG" == false || "$ONDISK_DYN_CONFIG" == false ]]; then 
        
             REGISTER_NODE=true 
        
           else 
        
             REGISTER_NODE=false 
        
           fi

).

This causes problem when the new zookeeper node reuses an old PVC. During bootup, if the myid file and dynamic config file exist, the startup script skips updating the membership list, assuming it is correct. But in reality, this node was removed from the membership list from the dynamic config file during teardown.
This causes the newly started node to keep crashing with the following exception:

│ 2023-04-10 19:34:44,914 [myid:2] - INFO  [main:AbstractConnector@381] - Stopped ServerConnector@5c30a9b0{HTTP/1.1, (http/1.1)}{0.0.0.0:7000}                                                            │
│ 2023-04-10 19:34:44,915 [myid:2] - INFO  [main:ContextHandler@1154] - Stopped o.e.j.s.ServletContextHandler@9d5509a{/,null,STOPPED}                                                                     │
│ 2023-04-10 19:34:44,915 [myid:2] - ERROR [main:QuorumPeerMain@114] - Unexpected exception, exiting abnormally                                                                                           │
│ java.lang.RuntimeException: My id 2 not in the peer list                                                                                                                                                │
│     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:1128)                                                                                                                        │
│     at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:229)                                                                                                         │
│     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:137)                                                                                                      │
│     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:91)                                                                                                                   │
│ 2023-04-10 19:34:44,916 [myid:2] - INFO  [main:ZKAuditProvider@42] - ZooKeeper audit is disabled.                                                                                                       │
│ 2023-04-10 19:34:44,917 [myid:2] - ERROR [main:ServiceUtils@42] - Exiting JVM with code 1

iampranabroy · 2023-11-09T04:21:22Z

Hey @hoyhbx @anishakj - Any plans to merge the changes for future releases?

anishakj · 2023-12-14T08:41:53Z

@hoyhbx Could you please increase the coverage?

hoyhbx linked a pull request Feb 5, 2023 that will close this issue

Check orphan PVC before updating statefulSet #526

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Zookeeper pod keeps crashing when scaling down and up #513

Zookeeper pod keeps crashing when scaling down and up #513

hoyhbx commented Nov 18, 2022 •

edited

Loading

hoyhbx commented Apr 7, 2023

hoyhbx commented Apr 10, 2023

iampranabroy commented Nov 9, 2023

anishakj commented Dec 14, 2023

Zookeeper pod keeps crashing when scaling down and up #513

Zookeeper pod keeps crashing when scaling down and up #513

Comments

hoyhbx commented Nov 18, 2022 • edited Loading

Description

Importance

Location

Suggestions for an improvement

hoyhbx commented Apr 7, 2023

hoyhbx commented Apr 10, 2023

iampranabroy commented Nov 9, 2023

anishakj commented Dec 14, 2023

hoyhbx commented Nov 18, 2022 •

edited

Loading