azcopy can't resume cloning job if controller pod is killed during job run #1950
Comments
yes, that's a limitation since the job status can only be stored locally in |
How can users detect such a state? How can they recover or resume? Who deletes incompletely cloned volumes in Azure? There is no PV created for them, so how is a user supposed to find them in the first place? I am afraid the cloning is not very useful if it can leak volumes that need manual cleanup.
Can Kubernetes |
What happened:
While testing Azure File cloning in OpenShift, we noticed that if a controller pod running an azcopy job is killed, any clone PVC still being copied gets stuck in the Pending phase.
This seems to be the effect of azcopy relying on job plan files (`AZCOPY_JOB_PLAN_LOCATION`, or `~/.azcopy` by default) to track jobs, which can be lost if they are stored on an ephemeral volume.

Is this a known limitation, or is there a recommended solution?
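To make the lookup described above concrete, here is a minimal Python sketch of how the job plan directory could be resolved and checked after a pod restart. It follows the issue text in treating `~/.azcopy` as the default. The helper names `resolve_job_plan_dir` and `has_plan_files` are hypothetical, and "any regular file present" is only a crude heuristic that does not distinguish plan files from other files azcopy may write there.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_job_plan_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the directory where azcopy is expected to keep job plan files.

    Assumption (from the issue text): AZCOPY_JOB_PLAN_LOCATION wins if set,
    otherwise ~/.azcopy is used.
    """
    env = os.environ if env is None else env
    override = env.get("AZCOPY_JOB_PLAN_LOCATION")
    if override:
        return Path(override)
    return Path.home() / ".azcopy"


def has_plan_files(plan_dir: Path) -> bool:
    """Heuristic: True if the directory exists and holds at least one regular file."""
    return plan_dir.is_dir() and any(p.is_file() for p in plan_dir.iterdir())


if __name__ == "__main__":
    plan_dir = resolve_job_plan_dir()
    state = "present" if has_plan_files(plan_dir) else "missing (jobs cannot be resumed)"
    print(f"{plan_dir}: job plan state {state}")
```

If this reports the state as missing right after a controller restart, the plan files did not survive and an interrupted copy has nothing to resume from.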
What you expected to happen:
Clone job surviving a lost controller pod.
How to reproduce it:
Anything else we need to know?:
Checking the Helm charts in this repo, the destination for those job plan files seems to be ephemeral (an `emptyDir` volume), so the issue would occur with this deployment as well:

azurefile-csi-driver/charts/v1.30.2/azurefile-csi-driver/templates/csi-azurefile-controller.yaml
Line 229 in eefed91
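The same check can be done on any controller deployment by looking at which volume backs the job plan path. The following is a rough Python sketch that works on an already-parsed pod spec (a plain dict). The container name, volume name, and mount path in the usage example are hypothetical placeholders rather than values taken from the chart, and only exact `mountPath` matches are considered.

```python
from typing import Any, Dict, Optional


def volume_backing_path(
    pod_spec: Dict[str, Any], container: str, path: str
) -> Optional[Dict[str, Any]]:
    """Return the pod volume mounted exactly at `path` in `container`, or None."""
    for c in pod_spec.get("containers", []):
        if c.get("name") != container:
            continue
        for mount in c.get("volumeMounts", []):
            if mount.get("mountPath") != path:
                continue
            for vol in pod_spec.get("volumes", []):
                if vol.get("name") == mount.get("name"):
                    return vol
    return None


def is_ephemeral(volume: Optional[Dict[str, Any]]) -> bool:
    """No volume means the container filesystem; emptyDir is removed with the pod."""
    return volume is None or "emptyDir" in volume


if __name__ == "__main__":
    # Hypothetical pod spec; the names and path are placeholders.
    spec = {
        "containers": [
            {
                "name": "azurefile",
                "volumeMounts": [{"name": "azcopy-dir", "mountPath": "/root/.azcopy"}],
            }
        ],
        "volumes": [{"name": "azcopy-dir", "emptyDir": {}}],
    }
    vol = volume_backing_path(spec, "azurefile", "/root/.azcopy")
    print("job plan storage is ephemeral:", is_ephemeral(vol))
```

A result of `True` means the job plan files are lost when the pod is deleted, which matches the behavior reported here.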
Environment: