
Volume without a target node after 1 node down in a 3 node cluster #1724

Open
veenadong opened this issue Aug 20, 2024 · 22 comments

@veenadong

On 2.7.0, after taking 1 node down from a 3-node cluster:

core@glop-nm-126-mem2:~$ kubectl mayastor get volume cc36aef4-0ac9-459b-9e56-c571d5ba2c80 -o yaml
spec:
  num_replicas: 2
  size: 32212254720
  status: Created
  uuid: cc36aef4-0ac9-459b-9e56-c571d5ba2c80
  topology:
    node_topology: !labelled
      exclusion: {}
      inclusion: {}
      affinitykey: []
    pool_topology: !labelled
      exclusion: {}
      inclusion:
        openebs.io/created-by: operator-diskpool
      affinitykey: []
  policy:
    self_heal: true
  thin: true
  num_snapshots: 0
state:
  size: 32212254720
  status: Online
  uuid: cc36aef4-0ac9-459b-9e56-c571d5ba2c80
  replica_topology:
    c51a4d5d-c3b7-48cf-8674-07e7f51321fe:
      node: glop-nm-126-mem3.glcpdev.cloud.hpe.com
      pool: glop-nm-126-mem3.glcpdev.cloud.hpe.com-disk
      state: Online
      usage:
        capacity: 32212254720
        allocated: 926941184
        allocated_snapshots: 0
        allocated_all_snapshots: 0
    85e49541-ff6b-404c-b0aa-eb0e747b1a48:
      node: glop-nm-126-mem1.glcpdev.cloud.hpe.com
      pool: glop-nm-126-mem1.glcpdev.cloud.hpe.com-disk
      state: Unknown
  usage:
    capacity: 32212254720
    allocated: 926941184
    allocated_replica: 926941184
    allocated_snapshots: 0
    allocated_all_snapshots: 0
    total_allocated: 926941184
    total_allocated_replicas: 926941184
    total_allocated_snapshots: 0

Pods are not able to attach the volume:

  Warning  FailedMount  4m12s (x52 over 94m)  kubelet  MountVolume.MountDevice failed for volume "pvc-9a3c606b-9ca2-4438-a3ad-1a07138b6b95" : rpc error: code = Internal desc = Failed to stage volume 9a3c606b-9ca2-4438-a3ad-1a07138b6b95: attach failed: IO error: Input/output error (os error 5), args: hostnqn=nqn.2019-05.io.openebs:node-name:glop-nm-126-mem2.glcpdev.cloud.hpe.com,hostid=42164807-edfc-e94a-3af3-29184e3733b2,nqn=nqn.2019-05.io.openebs:9a3c606b-9ca2-4438-a3ad-1a07138b6b95,transport=tcp,traddr=10.245.244.129,trsvcid=8420,reconnect_delay=10,ctrl_loss_tmo=1980,nr_io_queues=2

Attached is the system dump (note: log collection failed using the plugin, so the logs were captured using a different method).
mayastor.log.gz
mayastor-2024-08-20--18-05-02-UTC.tar.gz

@dcaputo-harmoni

dcaputo-harmoni commented Sep 30, 2024

I am seeing this same error (attach failed: IO error: Input/output error (os error 5)) after taking one node down and bringing it back up. When running kubectl-mayastor get volumes, the volumes whose target was the downed node are listed with a target node of <none> and accessibility of <none>, but a status of Online.

Just to provide some further details here, it looks like for the volumes that went down as a result of the node going down, the frontend/host_acl node differs from the target node, whereas for the volumes that remained working, this was the same.

{
  "uuid": "8388d455-d250-4706-a16b-55dfa6ef8327",
  "size": 8589934592,
  "labels": null,
  "num_replicas": 2,
  "status": {
    "Created": "Online"
  },
  "policy": {
    "self_heal": true
  },
  "topology": {
    "node": {
      "Labelled": {
        "exclusion": {},
        "inclusion": {}
      }
    },
    "pool": {
      "Labelled": {
        "exclusion": {},
        "inclusion": {
          "openebs.io/created-by": "operator-diskpool"
        }
      }
    }
  },
  "last_nexus_id": null,
  "operation": null,
  "thin": true,
  "target": {
    "node": "aks-storage-93614762-vmss000002",
    "nexus": "9e0bbe18-9b2a-4aaf-ac71-10edf18a044d",
    "protocol": "nvmf",
    "active": true,
    "config": {
      "controllerIdRange": {
        "start": 5,
        "end": 6
      },
      "reservationKey": 12425731461037558000,
      "reservationType": "ExclusiveAccess",
      "preemptPolicy": "Holder"
    },
    "frontend": {
      "host_acl": [
        {
          "node_name": "aks-storage-93614762-vmss000001",
          "node_nqn": "nqn.2019-05.io.openebs:node-name:aks-storage-93614762-vmss000001"
        }
      ]
    }
  },
  "publish_context": {},
  "affinity_group": null
}

@tiagolobocastro
Contributor

Seems we had missed the first one @veenadong, sorry about that.

@dcaputo-harmoni could you please share the volume attachments which reference this volume (if any) and also a support bundle?

@dcaputo-harmoni

@tiagolobocastro Unfortunately I had to kill and restore the cluster right after this happened, and didn't get a chance to export the data you are looking for before I did. If it happens again I'll provide these details, thanks. I was running mayastor 2.7.0 and just upgraded to 2.7.1 when it rebuilt, and I know there are some stability improvements in there so am wondering if that might help.

@tiagolobocastro
Contributor

No problem, keep an eye on it and should it happen again please let us know.
Without logs it's hard to say whether any stability fix would help here. It could be a data-plane bug, or some simple miscommunication between the control-plane and CSI causing the volume to not be published.

@tiagolobocastro
Contributor

tiagolobocastro commented Oct 3, 2024

I think this bug, #1747 (or a variation of it), explains what happens here: CSI and the control-plane get out of sync, the volume ends up not staying published, and csi-node keeps trying to connect to a target which is not there.
A few things we can improve here:

  1. fix whatever caused CSI and the controller to get out of sync
  2. if csi-node can't connect to the volume target, it should check whether the subsystem is listening and report this information (a manual version of that check is sketched below). If it stays this way it surely means the volume target is not being created.
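
For reference, a manual version of that subsystem check can be run from the app node with nvme-cli, using the address/port values taken from the FailedMount error above:

nvme discover -t tcp -a 10.245.244.129 -s 8420

If the target exists and is shared, this should list the volume's NVMe subsystem in the discovery log; if the target was never created, the discovery will fail to connect.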

@Abhinandan-Purkait @dsharma-dc any other thoughts here?

@dsharma-dc
Contributor

This looks like the sequence of events that happened.

old node = glop-nm-126-mem1.glcpdev.cloud.hpe.com
new node = glop-nm-126-mem3.glcpdev.cloud.hpe.com

17:04:58 - Unpublish of the volume triggered as a result of node shutdown - failed (503 Service Unavailable)
17:05:06 - Publish volume triggered as the app moves to the new node
17:05:08 - Unpublish of the volume attempted again - failed (503 Service Unavailable)
17:05:19 - Unpublish still hasn't succeeded as the old node is down, but the spec is cleared of the old target
17:05:21 - New target got created on the new node as the publish proceeded.

 [pod/mayastor-agent-core-88bc8d8b9-k7dcs/agent-core] 2024-08-20T17:05:21.330513Z INFO core::controller::resources::operations_helper: complete_create, val: Nexus { node: NodeId("glop-nm-126-mem3.glcpdev.cloud.hpe.com"), name: "cc36aef4-0ac9-459b-9e56-c571d5ba2c80", uuid: NexusId(c970c537-ec9f-46f7-bf6b-7997bad0e421, "c970c537-ec9f-46f7-bf6b-7997bad0e421"), size: 32212254720, status: Online, children: [Child { uri: ChildUri("bdev:///c51a4d5d-c3b7-48cf-8674-07e7f51321fe?uuid=c51a4d5d-c3b7-48cf-8674-07e7f51321fe"), state: Online, rebuild_progress: None, state_reason: Unknown, faulted_at: None, has_io_log: Some(false) }], device_uri: "", rebuilds: 0, share: None, allowed_hosts: [] }

[pod/mayastor-agent-core-88bc8d8b9-k7dcs/agent-core] at control-plane/agents/src/bin/core/controller/resources/operations_helper.rs:168
[pod/mayastor-agent-core-88bc8d8b9-k7dcs/agent-core] in core::volume::service::publish_volume with request: PublishVolume { uuid: VolumeId(cc36aef4-0ac9-459b-9e56-c571d5ba2c80, "cc36aef4-0ac9-459b-9e56-c571d5ba2c80"), target_node: Some(NodeId("glop-nm-126-mem3.glcpdev.cloud.hpe.com")), share: Some(Nvmf), publish_context: {"ioTimeout": "30"}, frontend_nodes: ["glop-nm-126-mem3.glcpdev.cloud.hpe.com"] }, volume.uuid: cc36aef4-0ac9-459b-9e56-c571d5ba2c80

17:05:22 - A retry of the failing unpublish happened and deleted the new target, as the spec now referenced the new one.
17:05:33 - nvme connect as part of volume staging fails since the target has been deleted.

@cwiggs

cwiggs commented Dec 28, 2024

Seems we had missed the first one @veenadong, sorry about that.

@dcaputo-harmoni could you please share the volume attachments which reference this volume (if any) and also a support bundle?

I'm running into this issue and am generating the support bundle now. I'm not sure whether the bundle contains sensitive info, so I'd prefer not to post it publicly; is there somewhere private I can send it?

@cwiggs

cwiggs commented Dec 28, 2024

To expand on this: I noticed that one of my three k3s worker nodes was down, so I restarted it. That is when I started seeing the OpenEBS issue, although I can't say for sure it's related.

I can also upload the "volume attachments", but I'm not 100% sure what those are. The PV that is failing to attach was created dynamically via a PVC, and the PVC is then mounted by a deployment; I can upload any of those manifests if that helps.
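
Assuming the "volume attachments" being asked for are the Kubernetes VolumeAttachment objects referencing the PV, they can be dumped with plain kubectl (the attachment name below is a placeholder taken from the list output):

kubectl get volumeattachments
kubectl get volumeattachment <attachment-name> -o yaml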

@cwiggs

cwiggs commented Dec 28, 2024

Looks like I was able to work around this by scaling the replica down to 0 and then back to 1. Volume mounts successfully now.

I'd still like to send the support bundle, let me know where I can send it.

@tiagolobocastro
Contributor

You can send it to [email protected]

@cwiggs

cwiggs commented Dec 30, 2024

You can send it to [email protected]

I sent the tar file via an email from my cwiggs.com domain. Let me know if there is anything else that will help with the issue.

Thanks!

@mardep123

Hi!
I'm also running into the same issue. I tried to make some changes to the cluster and therefore restarted one node at a time. I waited for the node to go from degraded to online. It worked on most of the nodes, but one of them did not go to online, but rather got target <none>, accessibility <none>. Not sure if it is related; next time it happens I can generate a support bundle.

@tiagolobocastro
Contributor

You can send it to [email protected]

I sent the tar file via an email from my cwiggs.com domain. Let me know if there is anything else that will help with the issue.

Thanks!

We haven't received the email.

Not sure if it is related, next time it happens I can generate a support bundle.

That would help, thank you

@martinfjohansen

Looks like I was able to work around this by scaling the replica down to 0 and then back to 1. Volume mounts successfully now.

@cwiggs How did you do this? What commands did you run?

@tiagolobocastro Is there any way to get the volumes back when/if this happens?

@cwiggs

cwiggs commented Jan 9, 2025

We haven't received the email.

I keep getting a response from googlegroups.com that they weren't able to deliver the email since it has an attachment. I just sent it a 3rd time using Google Drive and so far it seems it went through.

@cwiggs How did you do this? What commands did you run?

The workload I have is a deployment; I use k9s to scale it to 0 with the s (scale) key and then back up to 1 the same way. I believe there is also a kubectl scale command you can use.

@tiagolobocastro Is there any way to get the volumes back when/if this happens?

IME the volume isn't gone, it just isn't able to attach to the pod properly. It seems to me that something isn't properly detaching the volume from the previous pod when you restart the deployment, but if you scale to 0 and back up it unmounts/mounts properly.
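
For reference, the equivalent kubectl commands would be something like the following (deployment name and namespace are placeholders):

kubectl scale deployment <my-app> --replicas=0 -n <namespace>
# wait for the pod to terminate so the volume gets unstaged/unpublished
kubectl scale deployment <my-app> --replicas=1 -n <namespace>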

@martinfjohansen

@cwiggs, so what you did was to scale the OpenEBS deployment itself down to 0 and then up to 3?

@cwiggs

cwiggs commented Jan 9, 2025

@cwiggs, so what you did was to scale the OpenEBS deployment itself down to 0 and then up to 3?

No, just the deployment that is using OpenEBS for the PV and throwing this error.

@tiagolobocastro
Contributor

I've requested access to the google drive @cwiggs

@nneram

nneram commented Jan 10, 2025

Hello,

We also encountered this issue after a node reboot in our 3-node HA cluster. Specifically, the node that restarts experiences a MountVolume failure with the following error:

Name:             alertmanager-pgl-alertmanager-1
Namespace:        dome
Priority:         0
Service Account:  pgl-alertmanager
Node:             b02696-01srv/10.50.254.132
Start Time:       Thu, 09 Jan 2025 18:39:35 +0100
...
Events:
  Type     Reason       Age                   From     Message
  ----     ------       ----                  ----     -------
  Warning  FailedMount  3m6s (x512 over 17h)  kubelet  MountVolume.MountDevice failed for volume "pvc-3bfbec51-af9b-4ce2-90fa-06bec4b6075d" : rpc error: code = Internal desc = Failed to stage volume 3bfbec51-af9b-4ce2-90fa-06bec4b6075d: attach failed: IO error: Input/output error (os error 5), args: hostnqn=nqn.2019-05.io.openebs:node-name:b02696-01srv,hostid=33373150-3234-4e43-3630-333030463347,nqn=nqn.2019-05.io.openebs:3bfbec51-af9b-4ce2-90fa-06bec4b6075d,transport=tcp,traddr=10.50.254.132,trsvcid=8420,reconnect_delay=10,ctrl_loss_tmo=1980,nr_io_queues=2

The error logs from the csi-node show consistent failures:

csi-node   2025-01-10T09:49:28.468832Z ERROR csi_node::node: Failed to stage volume 3bfbec51-af9b-4ce2-90fa-06bec4b6075d: attach failed: IO error: Input/output error (os error 5), args: hostnqn=nqn.2019-05.io.openebs:node-name:b02696-01srv,hostid=33373150-3234-4e43-3630-333030463347,nqn=nqn.2019-05.io.openebs:3bfbec51-af9b-4ce2-90fa-06bec4b6075d,transport=tcp,traddr=10.50.254.132,trsvcid=8420,reconnect_delay=10,ctrl_loss_tmo=1980,nr_io_queues=2
csi-node     at control-plane/csi-driver/src/bin/node/node.rs:717
csi-node 
csi-node   2025-01-10T09:51:31.168755Z ERROR csi_node::node: Failed to stage volume 3bfbec51-af9b-4ce2-90fa-06bec4b6075d: attach failed: IO error: Input/output error (os error 5), args: hostnqn=nqn.2019-05.io.openebs:node-name:b02696-01srv,hostid=33373150-3234-4e43-3630-333030463347,nqn=nqn.2019-05.io.openebs:3bfbec51-af9b-4ce2-90fa-06bec4b6075d,transport=tcp,traddr=10.50.254.132,trsvcid=8420,reconnect_delay=10,ctrl_loss_tmo=1980,nr_io_queues=2
csi-node     at control-plane/csi-driver/src/bin/node/node.rs:717
...

This issue appears to affect StatefulSets, particularly alertmanager-pgl-alertmanager in our setup. We've found that restarting the pod resolves the error, but we're looking for a more permanent solution.
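
For anyone else hitting this: the "restart" here amounts to deleting the affected pod so the StatefulSet controller recreates it, e.g. (pod and namespace taken from the describe output above):

kubectl delete pod alertmanager-pgl-alertmanager-1 -n dome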

I'm willing to provide a support bundle too, but there's an issue with log collection because I have an OAuth2 proxy in front of Loki, and I haven't found a way to pass a token to kubectl mayastor dump system. If anyone knows how to handle this authentication issue it would be great! :) In the meantime, I'll try to work around this and send the bundle to you.

@tiagolobocastro
Contributor

Which version is this @nneram? We've fixed a few issues in 2.7.2 that could be related to this, though it would require scaling the application back down to 0 and then back up.
A support bundle would help.
Let's try without Loki; the plugin now collects the k8s logs as well, which might be enough.
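
For reference, the bundle is generated with the plugin command already mentioned earlier in the thread:

kubectl mayastor dump system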

@nneram

nneram commented Jan 10, 2025

I'm using Mayastor version 2.7.1 with the openebs chart version 4.1.1. My kubectl-mayastor plugin is at revision 399c96472dc3 (v2.7.2+0). I'll also try to email the bundle since I'm not comfortable sharing it here.

@tiagolobocastro
Contributor

Both are hitting the same/similar issue: the volume seems fine, but NodeStage is failing.
Please also send across the kernel logs to see if we can find some more info there.
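
Something like the following, run on the affected node, would do (exact tooling depends on the distro):

journalctl -k --since "1 hour ago" > kernel.log
# or, on nodes without journald:
dmesg -T > kernel.log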
