Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce --timeout-fatal to control readiness timeouts #409

Merged
merged 3 commits into from
Dec 11, 2023
Merged

Introduce --timeout-fatal to control readiness timeouts #409

merged 3 commits into from
Dec 11, 2023

Conversation

cromulentbanana
Copy link
Contributor

Problem Statement: I want to ensure that when linkerd-await is used for shutdown purposes, it will run its given command both in the case that the proxy sidecar container is injected to the pod, but also in the case where that proxy is not injected. The motivation here is that I have some cronjobs which need to run the same configured job in both a meshed, and possibly also unmeshed configuration, but where it's not practical to set the DISABLE environment variable in advance.

Under these conditions, the current behavior of linkerd-await leads to a timeout in the absence of the proxy and the job fails.

This modification allows linkerd-await to execute a command of a k8s job in either case where the proxy is or is not present.

I'm glad to take feedback and questions on this proposal.

@cromulentbanana cromulentbanana requested a review from a team as a code owner November 20, 2023 16:17
…ing in the absence of a linkerd-proxy

Signed-off-by: Dan Levin <[email protected]>
@cromulentbanana
Copy link
Contributor Author

Hi, is there any way to get feedback on whether this PR would be considered for inclusion?

@kflynn
Copy link
Member

kflynn commented Dec 1, 2023

@cromulentbanana Sorry for the delay here!! We'll be looking at this shortly – one immediate question is whether you've thought about how this might interact with the new sidecar KEP? I think we'll need to be thinking about that..

@cromulentbanana
Copy link
Contributor Author

cromulentbanana commented Dec 1, 2023

@cromulentbanana Sorry for the delay here!! We'll be looking at this shortly – one immediate question is whether you've thought about how this might interact with the new sidecar KEP? I think we'll need to be thinking about that..

Good question. In a nutshell, I expect that this force option should not introduce any undesired behaviors with the new sidecar functionality.

To the best of my understanding based on this documentation I believe that the changes I'm introducing with my force option when combined with the shutdown function should be orthogonal to and should not conflict with the new sidecar functionalities introduced in k8s 1.28

  1. It may be the case that if a restartPolicy is defined with value Always, then the linkerd-proxy sidecar may restart after it is terminated by linkerd-await however their documentation clearly states "Pod termination continues to only depend on the main containers. An init container with a restartPolicy of Always (named a sidecar) won't prevent the pod from terminating after the main containers exit."

  2. More generally, it may also be the case that with k8s >=1.28 linkerd-await itself may no longer be necessary at all for the purpose of ensuring the linkerd-proxy is running prior to the "main" container of a pod.

I haven't yet been able to evaluate points 1 or 2 experimentally.

@olix0r olix0r self-assigned this Dec 4, 2023
Copy link
Member

@olix0r olix0r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cromulentbanana Thanks for the PR! This makes sense in general.

What do you think about renaming this flag to be something like --timeout-fatal=false (with a default to true to preserve the existing behavior)?

My reasoning is: this behavior only takes effect when a timeout is specified. If the timeout is unset, then we'll never execute the command; so --force-exec doesn't feel accurate.

@cromulentbanana
Copy link
Contributor Author

@cromulentbanana Thanks for the PR! This makes sense in general.

What do you think about renaming this flag to be something like --timeout-fatal=false (with a default to true to preserve the existing behavior)?

My reasoning is: this behavior only takes effect when a timeout is specified. If the timeout is unset, then we'll never execute the command; so --force-exec doesn't feel accurate.

this sounds perfectly reasonable. Feel free to change the name to your suggestion, or let me know if I should do so.

…l and continue executing the command and/or shutting down

Signed-off-by: Dan Levin <[email protected]>
@cromulentbanana
Copy link
Contributor Author

@olix0r please let me know if this looks acceptable.

@cromulentbanana
Copy link
Contributor Author

hi, is anyone able to pick up the review/approval of this PR?

@olix0r
Copy link
Member

olix0r commented Dec 11, 2023

@cromulentbanana sorry for the lag. it's in my list to review today.

@olix0r
Copy link
Member

olix0r commented Dec 11, 2023

@cromulentbanana OK, I've figured out the clap-fu needed to support the suggestion I made (--timeout-fatal=false); I'm going to push that change to your branch (a long with the formatting fix needed to pass CI) and get this merged. Thanks for the submission! 🎉

@olix0r olix0r changed the title Introduce force-execute option to increase robustness Introduce --timeout-fatal to control readiness timeouts Dec 11, 2023
@olix0r olix0r merged commit 7f29fda into linkerd:main Dec 11, 2023
4 checks passed
@cromulentbanana cromulentbanana deleted the feature/force_exec branch December 12, 2023 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants