Handling of service accounts for BuildRuns #662
Comments
I think so, yes. Keeping SAs around until the BuildRun is deleted could lead to scaling issues, and if those SAs still have sensitive creds attached, they could be reused to perform malicious actions.
I think not, for reasons listed above. SAs are sensitive resources, and shouldn't exist if we don't expect to use them again. If the user wants to have their SA stick around, they can create an SA and reuse it, and more thoroughly audit its use. TBH I'm still a little fuzzy on the use case for single-use ephemeral SAs at all, and I think it adds quite a bit of conceptual load to users/operators to support it. I'd like to understand what problem they solve today, and if there's a better way (such as attaching secrets to BuildRuns and TaskRuns directly?) that would let us get out of the business of creating/destroying SAs. At the very least, some docs about when and why you might want to use |
First, about the two questions:
I think yes, because
Hi @imjasonh, at the beginning, I remember we did some investigation and found that we cannot use a Secret for the Tekton TaskRun directly; we can ONLY use a ServiceAccount (https://github.com/tektoncd/pipeline/blob/main/pkg/apis/pipeline/v1beta1/taskrun_types.go#L50). So we have to append all secrets to an SA and then pass the SA to the generated TaskRun in Shipwright: https://github.com/shipwright-io/build/blob/master/pkg/reconciler/buildrun/resources/taskrun.go#L216 If we HAVE TO use the SA, then we have some problems:
So we provide a new |
The only open question I have here is around an enhancement in the API via https://github.com/shipwright-io/build/pull/608/files (which didn't happen). We wanted to enhance the BuildRun Status field with a
I think generating SAs on-demand and deleting them is an anti-pattern in general, and we should think about how we can stage the deprecation of this feature. People configuring builds should be aware that there's a SA with Secrets that will run that Build, and they should know who that is and be able to configure RBAC correctly for that SA. By having ephemeral, temporary SAs, we might make the API a bit easier to use, but we make it a lot harder to secure and audit. I don't think the UX improvement is worth it TBH.
Enabling ephemeral SAs requires the Shipwright controller to have the power to create and destroy any SA in any namespace. This is pretty much god-level power, which Shipwright doesn't need to be able to run most builds. Operators can't opt out of ephemeral SA support today, except to remove that permission and see what breaks.
I would actually remove the legacy ;) logic of looking for |
@sbose78 oh, it's good that you say that. Did something change in OpenShift clusters?
I believe OpenShift builds have a similar challenge, and we addressed that by adding a dedicated service account -
Nope, upstream Tekton doesn't create any SAs except the one it uses to run its own controller/webhook.
By "OpenShift's original Tekton distribution" @adambkaplan was referring to the "downstream" version of Tekton that Vincent's team provides, i.e. "OpenShift Pipelines". @imjasonh The OLM Operator for that "downstream" offering certainly does add the
But I suspect what @sbose78 is getting at is that Shipwright should be "vendor neutral" and not depend on something that RH's downstream version of Tekton provides. At most, a future RH "downstream" version of Shipwright might leverage the
And furthermore, as @adambkaplan alluded to, I am pretty sure one can say that the |
OK. Let me try to consolidate what is happening on this issue, so we might close it soon.
[1] We agree we no longer need to maintain a
[2] I think @imjasonh is challenging the whole service account auto-generation mechanism. I investigated this during this week. What we currently use is what Tekton calls creds-init or built-in authentication, where, via a reference to a service account, Tekton mounts the secrets at the pod level in the right path. The main concerns here are:
It seems we can disable the Tekton built-in auth, as seen in the docs, in favor of ensuring ourselves that the secrets are mounted at the pod level, as seen in the following examples: via env vars, via volumes, via volumes + params. If [2] sounds reasonable, we should move that spike work to a different issue. Then we can safely close this ongoing issue. Opinions?
(2) sounds reasonable, and thanks for writing this up. Please let me know if you have any questions while investigating, I'd be happy to help.
+1 from me as well, thanks.
In the context of getting rid of the service account (which would in general be a nice idea), I just clicked through the Tekton samples from Jason. Based on them, I can understand how we would mount a single container registry secret to a container. Two scenarios I do not see covered:
How are they resolved? |
Good question. 🤔 If there are multiple dockerconfig secrets available, we could merge them ourselves, perhaps using
Another good one, you're on a roll. 😄 Instead of using the git PipelineResource, we could prepend our own step that calls Basically, |
Right, that would all work. But then we would remove the usage of a couple of nice Tekton features (credential merging, taking care of running
I think a few weeks ago we had a chat, and I have not yet had time to follow up on it. But the idea was basically to keep all this in place but get rid of the service account by enhancing the TaskRunSpec to allow specifying the Secrets there directly instead of indirectly through a service account. I personally still prefer this approach.
My point is most of those features are not going to make it to Tekton v1 anyway, so we shouldn't rely on them long-term. If being able to take advantage of these features is one of the main reasons we generate SAs, which brings in a load of negative effects, then we should get out of that business.
That would be fine too! I do think that Tekton still provides a useful TaskRun interface that is much easier to use than Pods, but perhaps I'm biased. If over time Shipwright decides Tekton isn't providing enough value to outweigh its costs, that's a reasonable move.
If the TaskRun specifies Secret-backed Volumes, it must also specify a ServiceAccount that has permission to read those Secrets (or |
The statement about removing Tekton was ironic. It still brings the logic to run non-init containers sequentially, for example.
Phew, okay good. 😅 I don't want to reimplement those parts of Tekton ever again ...unless it's to push them up into k8s itself. 🤔
I think the details on the implications of getting rid of the SAs should be addressed in #679. I'm gonna close this issue soonish.
Yeah, I think if Tekton is gonna remove the functionality we use today, then we have no choice but to re-implement some of the things and, while doing that, get rid of the service account. I guess we will need a staged approach to cover all aspects as discussed above.
@qu1queee I'm not sure why this was closed, I think there are a few somewhat large issues still unresolved, that might be able to be split into separate dependent issues:
Closed because of the comment in #662 (comment), where I understood we were OK to move the discussion on what it means to stop relying on SAs to a separate issue. We can split, yes.
Idea:
When dealing with BuildRuns, we have different scenarios for the interactions with service accounts. I'm summarizing the two most prominent ones, where we have room for improvement.
[1] We rely on default service accounts in the cluster
Failed if neither the default/pipeline service accounts are found?
[2] We autogenerate a service account for users when the generate field is used.
Some questions on the above:
Failed?
Note: Keep in mind that we use ownerreferences, so as soon as a BuildRun is deleted, the service account will also be deleted.
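To make the ownerreferences note concrete, here is a minimal Go sketch of a generated service account that points back at its BuildRun, so that Kubernetes garbage collection removes it once the BuildRun is deleted. The struct types are local stand-ins for the Kubernetes types, reduced to a few fields so the example has no dependencies (the real `metav1.OwnerReference` uses `*bool` for `Controller` and `BlockOwnerDeletion`). The helper name `generateServiceAccount`, the `-sa` name suffix, and the API version in the usage are assumptions for illustration, not a copy of the Shipwright implementation.

```go
package main

import "fmt"

// ownerReference is a simplified local stand-in for metav1.OwnerReference.
type ownerReference struct {
	APIVersion         string
	Kind               string
	Name               string
	UID                string
	Controller         bool
	BlockOwnerDeletion bool
}

// serviceAccount is a simplified local stand-in for a ServiceAccount object.
type serviceAccount struct {
	Name            string
	Namespace       string
	OwnerReferences []ownerReference
}

// buildRun carries only the identifying fields of a BuildRun (hypothetical shape).
type buildRun struct {
	APIVersion string
	Name       string
	Namespace  string
	UID        string
}

// generateServiceAccount is a hypothetical helper. It returns a service
// account in the BuildRun's namespace that is owned by that BuildRun, so
// deleting the BuildRun lets the garbage collector delete the account too.
func generateServiceAccount(br buildRun) serviceAccount {
	return serviceAccount{
		Name:      br.Name + "-sa", // naming scheme is an assumption
		Namespace: br.Namespace,
		OwnerReferences: []ownerReference{{
			APIVersion:         br.APIVersion,
			Kind:               "BuildRun",
			Name:               br.Name,
			UID:                br.UID,
			Controller:         true,
			BlockOwnerDeletion: true,
		}},
	}
}

func main() {
	// Illustrative values only; the API version and UID are made up.
	br := buildRun{
		APIVersion: "example.dev/v1alpha1",
		Name:       "my-buildrun",
		Namespace:  "builds",
		UID:        "1234",
	}
	sa := generateServiceAccount(br)
	fmt.Printf("%s/%s owned by %s %s\n",
		sa.Namespace, sa.Name, sa.OwnerReferences[0].Kind, sa.OwnerReferences[0].Name)
}
```

Note that the lifetime of such an account is tied to the BuildRun object rather than to the build finishing, which is what the questions above are about.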