PVC Creation stuck with v3.13.0 #5073

Closed
appcoders opened this issue Jan 10, 2025 · 11 comments
Labels
component/deployment Helm chart, kubernetes templates and configuration Issues/PRs component/rbd Issues related to RBD

Comments

@appcoders

I have been trying to get ceph-csi running for 3 days now. I am using k3s v1.31.4+k3s1 and ceph-csi v3.13.0.
I am following https://docs.ceph.com/en/latest/rbd/rbd-kubernetes/ and replaced the canary image tag with v3.13.0.

All pods come up, and I can reach the Ceph cluster without issues from the pod:

❯ kubectl exec -it csi-rbdplugin-provisioner-869fd747-46qf5 -c csi-rbdplugin -- /bin/sh
sh-5.1# ceph -s --id=kubernetes --key=***REDACTED**** -m 192.168.72.1
  cluster:
    id:     5f2e2b61-4b0c-4e90-aff7-281147b312a8
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum fpha01,fpha02,fpha03,fpha06,fpha08 (age 3d)
    mgr: fpha01(active, since 6d), standbys: fpha08, fpha06
    mds: 1/1 daemons up, 2 standby
    osd: 20 osds: 20 up (since 3d), 20 in (since 3d)

  data:
    volumes: 1/1 healthy
    pools:   5 pools, 225 pgs
    objects: 2.09M objects, 7.6 TiB
    usage:   23 TiB used, 47 TiB / 70 TiB avail
    pgs:     225 active+clean

  io:
    client:   6.0 KiB/s rd, 8.9 MiB/s wr, 1 op/s rd, 118 op/s wr

There are no stale rbd commands on the nodes/pods, and there is no special output or errors in dmesg on the nodes.
It looks like simply nothing happens after "setting image options":

csi-rbdplugin-provisioner-869fd747-46qf5 csi-rbdplugin I0110 18:11:31.932145       1 utils.go:266] ID: 23 Req-ID: pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c GRPC call: /csi.v1.Controller/CreateVolume
csi-rbdplugin-provisioner-869fd747-46qf5 csi-rbdplugin I0110 18:11:31.932320       1 utils.go:267] ID: 23 Req-ID: pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c GRPC request: {"capacity_range":{"required_bytes":1073741824},"name":"pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c","parameters":{"clusterID":"5f2e2b61-4b0c-4e90-aff7-281147b312a8","csi.storage.k8s.io/pv/name":"pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c","csi.storage.k8s.io/pvc/name":"raw-block-pvc5","csi.storage.k8s.io/pvc/namespace":"ceph-csi-rbd","imageFeatures":"layering","pool":"kubernetes"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Block":{}},"access_mode":{"mode":1}}]}
csi-rbdplugin-provisioner-869fd747-46qf5 csi-rbdplugin I0110 18:11:31.932455       1 rbd_util.go:1387] ID: 23 Req-ID: pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c setting disableInUseChecks: false image features: [layering] mounter: rbd
csi-rbdplugin-provisioner-869fd747-46qf5 csi-rbdplugin I0110 18:11:31.945711       1 omap.go:89] ID: 23 Req-ID: pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c got omap values: (pool="kubernetes", namespace="", name="csi.volumes.default"): map[]
csi-rbdplugin-provisioner-869fd747-46qf5 csi-rbdplugin I0110 18:11:31.949091       1 omap.go:159] ID: 23 Req-ID: pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c set omap keys (pool="kubernetes", namespace="", name="csi.volumes.default"): map[csi.volume.pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c:fdf82e72-0193-4ef9-9edb-c76ca7d72135])
csi-rbdplugin-provisioner-869fd747-46qf5 csi-rbdplugin I0110 18:11:31.950313       1 omap.go:159] ID: 23 Req-ID: pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c set omap keys (pool="kubernetes", namespace="", name="csi.volume.fdf82e72-0193-4ef9-9edb-c76ca7d72135"): map[csi.imagename:csi-vol-fdf82e72-0193-4ef9-9edb-c76ca7d72135 csi.volname:pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c csi.volume.owner:ceph-csi-rbd])
csi-rbdplugin-provisioner-869fd747-46qf5 csi-rbdplugin I0110 18:11:31.950329       1 rbd_journal.go:515] ID: 23 Req-ID: pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c generated Volume ID (0001-0024-5f2e2b61-4b0c-4e90-aff7-281147b312a8-0000000000000009-fdf82e72-0193-4ef9-9edb-c76ca7d72135) and image name (csi-vol-fdf82e72-0193-4ef9-9edb-c76ca7d72135) for request name (pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c)
csi-rbdplugin-provisioner-869fd747-46qf5 csi-rbdplugin I0110 18:11:31.950365       1 rbd_util.go:437] ID: 23 Req-ID: pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c rbd: create kubernetes/csi-vol-fdf82e72-0193-4ef9-9edb-c76ca7d72135 size 1024M (features: [layering]) using mon 192.168.72.1:6789,192.168.72.2:6789,192.168.72.3:6789,192.168.72.6:6789,192.168.72.8:6789
csi-rbdplugin-provisioner-869fd747-46qf5 csi-rbdplugin I0110 18:11:31.950379       1 rbd_util.go:1641] ID: 23 Req-ID: pvc-99e9948e-bbc5-4ade-be96-90b92cdd0e4c setting image options on kubernetes/csi-vol-fdf82e72-0193-4ef9-9edb-c76ca7d72135
csi-rbdplugin-provisioner-869fd747-46qf5 csi-provisioner I0110 18:11:31.930207       1 event.go:389] "Event occurred" object="ceph-csi-rbd/raw-block-pvc5" fieldPath="" kind="PersistentVolumeClaim" apiVersion="v1" type="Normal" reason="Provisioning" message="External provisioner is provisioning volume for claim \"ceph-csi-rbd/raw-block-pvc5\""

The install was done this way in the ceph-csi-rbd namespace:

kubectl create -f csidriver.yaml
kubectl create -f ceph-config-map.yaml
kubectl create -f csi-config-map.yaml
kubectl create -f csi-kms-config-map.yaml
kubectl create -f csi-rbd-secret.yaml
kubectl create -f csi-nodeplugin-rbac.yaml
kubectl create -f csi-provisioner-rbac.yaml
kubectl create -f csi-rbd-sc.yaml
kubectl create -f csi-rbdplugin-provisioner.yaml
kubectl create -f csi-rbdplugin.yaml
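
To confirm the deployment before creating the PVC, something like the following can be used (a minimal sketch; the namespace matches the install above, and the StorageClass is assumed to come from csi-rbd-sc.yaml):

kubectl -n ceph-csi-rbd get pods -o wide   # provisioner and nodeplugin pods should be Running
kubectl get storageclass                   # the RBD StorageClass from csi-rbd-sc.yaml should be listed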

And after everything is up:

kubectl create -f raw-block-pvc.yaml
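
For reference, a raw block PVC of the kind provisioned here could look roughly like this (a minimal sketch reconstructed from the provisioner log; the actual raw-block-pvc.yaml from the ceph-csi examples may differ, and the csi-rbd-sc StorageClass name is an assumption):

kubectl create -n ceph-csi-rbd -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-pvc5
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block            # raw block device instead of a filesystem
  resources:
    requests:
      storage: 1Gi             # matches the 1073741824 required_bytes in the log
  storageClassName: csi-rbd-sc # assumed name from csi-rbd-sc.yaml
EOF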

All YAMLs and log files are attached.

log.txt
csidriver.yaml.txt
ceph-config-map.yaml.txt
csi-config-map.yaml.txt
csi-kms-config-map.yaml.txt
csi-rbd-secret.yaml.txt
csi-nodeplugin-rbac.yaml.txt
csi-provisioner-rbac.yaml.txt
csi-rbd-sc.yaml.txt
csi-rbdplugin-provisioner.yaml.txt
csi-rbdplugin.yaml.txt
raw-block-pvc.yaml.txt

@Madhu-1
Collaborator

Madhu-1 commented Jan 15, 2025

@appcoders Did you try to create an RBD image from the RBD provisioner pod using the same Ceph user specified in the secret? If that works, please share the output, and also the csi-rbdplugin container logs from the provisioner pod.
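
A minimal sketch of that test, using the pod name from the report (the image name and key are placeholders):

kubectl -n ceph-csi-rbd exec -it csi-rbdplugin-provisioner-869fd747-46qf5 -c csi-rbdplugin -- \
  rbd create kubernetes/manual-test --size 100 --id kubernetes --key <key> -m 192.168.72.1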

@nixpanic added the component/rbd (Issues related to RBD) and component/deployment (Helm chart, kubernetes templates and configuration Issues/PRs) labels on Jan 15, 2025
@alex-ioma

Hi @appcoders, I haven't had time to take a look at all the manifests; however, I do see something potentially strange in the logs. You are using a .1 address as a monitor. Do you have a Ceph public network larger than /24, or do you have the gateway at a non-standard address?

If not, and your network is 192.168.72.0/24 with gateway 192.168.72.1, you might need to review your IP assignment.

But again, this is just a quick look at this issue.

Hope this helps.

@appcoders
Author

> Hi @appcoders, I haven't had time to take a look at all the manifests; however, I do see something potentially strange in the logs. You are using a .1 address as a monitor. Do you have a Ceph public network larger than /24, or do you have the gateway at a non-standard address?
>
> If not, and your network is 192.168.72.0/24 with gateway 192.168.72.1, you might need to review your IP assignment.
>
> But again, this is just a quick look at this issue.
>
> Hope this helps.

Hi @alex-ioma,
The Proxmox machines have 3 network cards: one for external traffic at 1 Gbit and two 10 Gbit interfaces, one for Ceph (192.168.72.0/24) and one for Proxmox internal traffic (192.168.71.0/24). So there is no need for a gateway, as the VMs have a second network card bound to a bridged interface on the Ceph network. And I can reach the Ceph cluster from the pod. But thanks for your thoughts and time.

@alex-ioma

> Hi @alex-ioma, the Proxmox machines have 3 network cards: one for external traffic at 1 Gbit and two 10 Gbit interfaces, one for Ceph (192.168.72.0/24) and one for Proxmox internal traffic (192.168.71.0/24). So there is no need for a gateway, as the VMs have a second network card bound to a bridged interface on the Ceph network. And I can reach the Ceph cluster from the pod. But thanks for your thoughts and time.

Gotcha, in that case this is not an issue. And 10 Gbit is a must for this kind of setup; very similar to what I have.
In any case, I would still not use a .1 address as a node IP (but that is more for peace of mind).

@appcoders
Author

> @appcoders Did you try to create an RBD image from the RBD provisioner pod using the same Ceph user specified in the secret? If that works, please share the output, and also the csi-rbdplugin container logs from the provisioner pod.

Hi @Madhu-1

Good point. I created an image directly on the ceph host with rbd and the user from the secret:

rbd create --size 100 kubernetes/image --id=kubernetes --key=**redacted** -m 192.168.72.2

The command exits and the image is created.
Now, from the pod, I run:

sh-5.1# rbd ls kubernetes --id=kubernetes --key=**redacted** -m 192.168.72.2
image
sh-5.1# rbd ls kubernetes --id=kubernetes --key=**redacted** -m 192.168.72.2:3300
image
sh-5.1# rbd ls kubernetes --id=kubernetes --key=**redacted** -m 192.168.72.2:6789
image

which lists the image fine, regardless of which port/protocol is used.

Now, trying to create an image with rbd from the pod, it stalls forever:

sh-5.1# rbd create --size 1024 kubernetes/testimage --id=kubernetes --key=**redacted** -m 192.168.72.2

I captured network traffic on host 192.168.72.2, the first part from the pod and the second part from another Ceph host:

(screenshot of the captured network traffic attached)

So can a network/firewall issue be ruled out? I don't understand what the problem is at all. How can I debug this further?

@appcoders
Author

@Madhu-1
I tried to dig deeper.
In the meantime I upgraded Ceph from quincy to reef. Still the same behavior.
For better understanding:
Ceph is running on the host cluster (Proxmox). K3s is running as VMs on Proxmox (QEMU). I installed a fresh Debian bookworm VM with the ceph-common reef tools, then tried rbd create from this VM. I got the same behavior: ceph -s and rbd ls work, but rbd create does not.

So I logged debug_mon at 10/10 and got this output for the access from the debiantest VM with rbd create:

2025-01-15T20:33:13.313+0100 761b220006c0 10 mon.fpha03@2(peon) e7 ms_handle_fast_authentication session 0x5d939ad10900 con 0x5d939a2b4000 addr - MonSession(unknown.0  is open , features 0x0 (unknown))
2025-01-15T20:33:13.314+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 ms_handle_accept con 0x5d939a2b4000 session 0x5d939ad10900 registering session for 192.168.72.54:0/1189646554
2025-01-15T20:33:13.315+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 handle_mon_get_map
2025-01-15T20:33:13.315+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 handle_subscribe mon_subscribe({config=0+,monmap=0+}) v3
2025-01-15T20:33:13.315+0100 761b23e006c0 10 mon.fpha03@2(peon).config check_sub next 0 have 23
2025-01-15T20:33:13.315+0100 761b23e006c0 10 mon.fpha03@2(peon).config refresh_config crush_location for remote_host debiantest is {}
2025-01-15T20:33:13.315+0100 761b23e006c0 10 mon.fpha03@2(peon).config maybe_send_config to client.? (changed)
2025-01-15T20:33:13.315+0100 761b23e006c0 10 mon.fpha03@2(peon).config send_config to client.?
2025-01-15T20:33:13.315+0100 761b23e006c0 10 mon.fpha03@2(peon).monmap v7 check_sub monmap next 0 have 7
2025-01-15T20:33:13.315+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 ms_handle_reset 0x5d939a2b4000 192.168.72.54:0/1189646554
2025-01-15T20:33:13.315+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 reset/close on session client.? 192.168.72.54:0/1189646554
2025-01-15T20:33:13.315+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 remove_session 0x5d939ad10900 client.? 192.168.72.54:0/1189646554 features 0x3f01cfbffffdffff

@appcoders
Author

@Madhu-1
I increased debug_mon to 20/20. The command

rbd pool stats kubernetes --id=kubernetes --key=**redacted**  -m 192.168.72.3:3300

also stalls.

Log from debiantest node:

2025-01-15T21:08:27.780+0100 761b220006c0 10 mon.fpha03@2(peon) e7 handle_auth_request con 0x5d939a2b5000 (start) method 2 payload 27
2025-01-15T21:08:27.780+0100 761b220006c0 10 mon.fpha03@2(peon).auth v83889 _assign_global_id 588411642 (max 588444096)
2025-01-15T21:08:27.781+0100 761b220006c0 10 mon.fpha03@2(peon) e7 handle_auth_request con 0x5d939a2b5000 (more) method 2 payload 36
2025-01-15T21:08:27.781+0100 761b220006c0 10 mon.fpha03@2(peon) e7 ms_handle_fast_authentication session 0x5d939c042d80 con 0x5d939a2b5000 addr - MonSession(unknown.0  is open , features 0x0 (unknown))
2025-01-15T21:08:27.782+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 ms_handle_accept con 0x5d939a2b5000 session 0x5d939c042d80 registering session for 192.168.72.54:0/1177242017
2025-01-15T21:08:27.783+0100 761b23e006c0 20 mon.fpha03@2(peon) e7 _ms_dispatch existing session 0x5d939c042d80 for client.?
2025-01-15T21:08:27.783+0100 761b23e006c0 20 mon.fpha03@2(peon) e7  entity_name client.kubernetes global_id 588411642 (new_ok) caps profile rbd
2025-01-15T21:08:27.783+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 handle_mon_get_map
2025-01-15T21:08:27.783+0100 761b23e006c0 20 mon.fpha03@2(peon) e7 _ms_dispatch existing session 0x5d939c042d80 for client.?
2025-01-15T21:08:27.783+0100 761b23e006c0 20 mon.fpha03@2(peon) e7  entity_name client.kubernetes global_id 588411642 (new_ok) caps profile rbd
2025-01-15T21:08:27.783+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 handle_subscribe mon_subscribe({config=0+,monmap=0+}) v3
2025-01-15T21:08:27.783+0100 761b23e006c0 10 mon.fpha03@2(peon).config check_sub next 0 have 23
2025-01-15T21:08:27.783+0100 761b23e006c0 10 mon.fpha03@2(peon).config refresh_config crush_location for remote_host debiantest is {}
2025-01-15T21:08:27.783+0100 761b23e006c0 20 mon.fpha03@2(peon).config refresh_config client.kubernetes crush {} device_class
2025-01-15T21:08:27.783+0100 761b23e006c0 10 mon.fpha03@2(peon).config maybe_send_config to client.? (changed)
2025-01-15T21:08:27.783+0100 761b23e006c0 10 mon.fpha03@2(peon).config send_config to client.?
2025-01-15T21:08:27.783+0100 761b23e006c0 10 mon.fpha03@2(peon).monmap v7 check_sub monmap next 0 have 7
2025-01-15T21:08:27.783+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 ms_handle_reset 0x5d939a2b5000 192.168.72.54:0/1177242017
2025-01-15T21:08:27.783+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 reset/close on session client.? 192.168.72.54:0/1177242017
2025-01-15T21:08:27.783+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 remove_session 0x5d939c042d80 client.? 192.168.72.54:0/1177242017 features 0x3f01cfbffffdffff

For comparison, the log for the same command from a Proxmox host that has no Ceph OSDs/MGR/MDS, where it works fine:

2025-01-15T21:08:23.732+0100 761b220006c0 10 mon.fpha03@2(peon) e7 handle_auth_request con 0x5d939a2b4c00 (start) method 2 payload 27
2025-01-15T21:08:23.732+0100 761b220006c0 10 mon.fpha03@2(peon).auth v83889 _assign_global_id 588411612 (max 588444096)
2025-01-15T21:08:23.732+0100 761b220006c0 10 mon.fpha03@2(peon) e7 handle_auth_request con 0x5d939a2b4c00 (more) method 2 payload 36
2025-01-15T21:08:23.732+0100 761b220006c0 10 mon.fpha03@2(peon) e7 ms_handle_fast_authentication session 0x5d939c097d40 con 0x5d939a2b4c00 addr - MonSession(unknown.0  is open , features 0x0 (unknown))
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 ms_handle_accept con 0x5d939a2b4c00 session 0x5d939c097d40 registering session for 192.168.72.12:0/613661562
2025-01-15T21:08:23.733+0100 761b23e006c0 20 mon.fpha03@2(peon) e7 _ms_dispatch existing session 0x5d939c097d40 for client.?
2025-01-15T21:08:23.733+0100 761b23e006c0 20 mon.fpha03@2(peon) e7  entity_name client.kubernetes global_id 588411612 (new_ok) caps profile rbd
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 handle_mon_get_map
2025-01-15T21:08:23.733+0100 761b23e006c0 20 mon.fpha03@2(peon) e7 _ms_dispatch existing session 0x5d939c097d40 for client.?
2025-01-15T21:08:23.733+0100 761b23e006c0 20 mon.fpha03@2(peon) e7  entity_name client.kubernetes global_id 588411612 (new_ok) caps profile rbd
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 handle_subscribe mon_subscribe({config=0+,monmap=0+}) v3
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon).config check_sub next 0 have 23
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon).config refresh_config crush_location for remote_host fpha12 is {}
2025-01-15T21:08:23.733+0100 761b23e006c0 20 mon.fpha03@2(peon).config refresh_config client.kubernetes crush {} device_class
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon).config maybe_send_config to client.? (changed)
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon).config send_config to client.?
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon).monmap v7 check_sub monmap next 0 have 7
2025-01-15T21:08:23.733+0100 761b23e006c0 20 mon.fpha03@2(peon) e7 _ms_dispatch existing session 0x5d939c097d40 for client.?
2025-01-15T21:08:23.733+0100 761b23e006c0 20 mon.fpha03@2(peon) e7  entity_name client.kubernetes global_id 588411612 (new_ok) caps profile rbd
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 handle_subscribe mon_subscribe({mgrmap=0+}) v3
2025-01-15T21:08:23.733+0100 761b23e006c0 20 is_capable service=mon command= read addr 192.168.72.12:0/613661562 on cap allow profile rbd
2025-01-15T21:08:23.733+0100 761b23e006c0 20  allow so far , doing grant allow profile rbd
2025-01-15T21:08:23.733+0100 761b23e006c0 20  match
2025-01-15T21:08:23.733+0100 761b23e006c0 20 mon.fpha03@2(peon).mgr e185 Sending map to subscriber 0x5d939a2b4c00 192.168.72.12:0/613661562
2025-01-15T21:08:23.733+0100 761b23e006c0 20 mon.fpha03@2(peon) e7 _ms_dispatch existing session 0x5d939c097d40 for client.?
2025-01-15T21:08:23.733+0100 761b23e006c0 20 mon.fpha03@2(peon) e7  entity_name client.kubernetes global_id 588411612 (new_ok) caps profile rbd
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 handle_subscribe mon_subscribe({osdmap=0}) v3
2025-01-15T21:08:23.733+0100 761b23e006c0 20 is_capable service=mon command= read addr 192.168.72.12:0/613661562 on cap allow profile rbd
2025-01-15T21:08:23.733+0100 761b23e006c0 20  allow so far , doing grant allow profile rbd
2025-01-15T21:08:23.733+0100 761b23e006c0 20  match
2025-01-15T21:08:23.733+0100 761b23e006c0 20 is_capable service=osd command= read addr 192.168.72.12:0/613661562 on cap allow profile rbd
2025-01-15T21:08:23.733+0100 761b23e006c0 20  allow so far , doing grant allow profile rbd
2025-01-15T21:08:23.733+0100 761b23e006c0 20  match
2025-01-15T21:08:23.733+0100 761b23e006c0 10 mon.fpha03@2(peon).osd e32954 check_osdmap_sub 0x5d939fcd3140 next 0 (onetime)
2025-01-15T21:08:23.740+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 ms_handle_reset 0x5d939a2b4c00 192.168.72.12:0/613661562
2025-01-15T21:08:23.740+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 reset/close on session client.? 192.168.72.12:0/613661562
2025-01-15T21:08:23.740+0100 761b23e006c0 10 mon.fpha03@2(peon) e7 remove_session 0x5d939c097d40 client.? 192.168.72.12:0/613661562 features 0x3f01cfbffffdffff

@Madhu-1
Collaborator

Madhu-1 commented Jan 16, 2025

@appcoders Can you run rbd create with the --debug-rbd flag?

rbd create replicapool/testing1 --size=1G --debug-rbd=20

@idryomov will be able to help.

@idryomov
Contributor

The problem is likely with connecting to one (or more) of the OSDs, not with the monitor. It's definitely too early to rule out a network/firewall issue.

In addition to --debug-rbd 20, append --debug-ms 1 to the rbd create command and let it hang for a minute or two before capturing the output.
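
Combined with the earlier suggestion, the invocation might look something like this (a sketch only; pool/image name, key, and monitor address follow the examples earlier in the thread):

rbd create kubernetes/testimage --size 1024 \
    --id kubernetes --key <key> -m 192.168.72.2 \
    --debug-rbd 20 --debug-ms 1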

@appcoders
Author

Hi @idryomov and @Madhu-1 ,

thanks for your kind support. I am very grateful, especially for the last tip. It is a network problem after all. We are clarifying things with the hosting provider; there is probably a problem with the on-site cabling or the switch configuration.

Logging was the crucial piece of the puzzle for me:

192.168.72.54:0/2070396725 >> [v2:192.168.72.8:6808/2877638,v1:192.168.72.8:6809/2877638] conn(0x55bc92d80430 msgr2=0x55bc92d828c0 unknown :-1 s=STATE_CONNECTING_RE l=1).tick see no progress in more than 10000000 us during connecting to v2:192.168.72.8:6808/2877638, fault.

It's one host that is not reachable from the VM.
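
A quick way to confirm this kind of OSD reachability problem from the VM, using the address and ports from the log line above (assuming nc is available):

nc -vz -w 5 192.168.72.8 6808   # v2 OSD port from the "no progress" message
nc -vz -w 5 192.168.72.8 6809   # v1 OSD port from the same message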

With this knowledge of how to get detailed logging, I hope that others will also be able to track down such errors more quickly.

So I will close this issue as I think the problem will vanish soon :-)

@appcoders
Author

So now everything works fine after the network configuration was fixed. Thanks again.
