Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solution Design: CyberArk Conjur Provider for Secret Store CSI Driver #2836

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

doodlesbykumbi
Copy link
Contributor

Desired Outcome

Please describe the desired outcome for this PR. Said another way, what was
the original request that resulted in these code changes? Feel free to copy
this information from the connected issue.

Implemented Changes

Describe how the desired outcome above has been achieved with this PR. In
particular, consider:

  • What's changed? Why were these changes made?
  • How should the reviewer approach this PR, especially if manual tests are required?
  • Are there relevant screenshots you can add to the PR description?

Connected Issue/Story

Resolves #[relevant GitHub issue(s), e.g. 76]

CyberArk internal issue ID: [insert issue ID]

Definition of Done

At least 1 todo must be completed in the sections below for the PR to be
merged.

Changelog

  • The CHANGELOG has been updated, or
  • This PR does not include user-facing changes and doesn't require a
    CHANGELOG update

Test coverage

  • This PR includes new unit and integration tests to go with the code
    changes, or
  • The changes in this PR do not require tests

Documentation

  • Docs (e.g. READMEs) were updated in this PR
  • A follow-up issue to update official docs has been filed here: [insert issue ID]
  • This PR does not require updating any documentation

Behavior

  • This PR changes product behavior and has been reviewed by a PO, or
  • These changes are part of a larger initiative that will be reviewed later, or
  • No behavior was changed with this PR

Security

  • Security architect has reviewed the changes in this PR,
  • These changes are part of a larger initiative with a separate security review, or
  • There are no security aspects to these changes

@codeclimate
Copy link

codeclimate bot commented Jun 22, 2023

Code Climate has analyzed commit ef5f736 and detected 164 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 1
Style 163

The test coverage on the diff in this pull request is 100.0% (50% is the threshold).

This pull request will bring the total coverage in the repository to 87.0% (-1.2% change).

View more on Code Climate.


#### Implementation Tasks

1. **Implement the Conjur Provider**: Develop a Go-based application that uses the CyberArk Conjur API to authenticate and fetch secrets. This application should implement the `CSIDriverProviderServer` interface, which includes the `Mount` and `Version` methods. These methods will need to communicate with the CyberArk Conjur API to fetch secrets and return them to the Secrets Store CSI Driver.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we identified during our Friday meeting or something that there's a pre-req task we need here to add authn-jwt capability to conjur-api-go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

authn-jwt is present in conjur-api-go, albeit requiring an environment variable for the token


2. **Create the Helm Charts**: Develop the necessary Kubernetes resource definitions (Deployment, Service, etc.) and values files to manage the deployment of the Conjur provider. This Helm chart will allow users to easily deploy and configure the Conjur provider in their Kubernetes clusters.

3. **Integration with CyberArk Conjur**: The logic for the provider to authenticate with the Conjur API and fetch secrets needs to be developed. This will involve using the `authn-jwt` authenticator for authentication with the workload service account token and making API calls to fetch secrets.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that some of what we need to do here has been done before in Secrets Provider and the AuthnK8s client, how should we plan on using/repurposing that content?

We've talked before about Secrets Provider's packages being difficult to consume out-of-context - should we have a dev task to consider/address that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good question. I think this is something we can figure out at implementation time, if we want to solve the issue across different packages or just use this project as a blank slate and an opportunity to create something easier to consume

- What's the best way to configure things like CA certs for the provider etc. ? Perhaps allowing the `SecretProviderClass` to reference other Kubernetes resources such `ConfigMap` or `Secret` is the answer
- What are our options for supporting authentication via authenticators other than authn-jwt ?
- What is the upgrade strategy for the CyberArk Conjur provider deployed via Helm?
- How to handle secret rotation in CyberArk Conjur?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated this

applianceUrl: https://conjur.some-org.com
account: some-org
authnLogin: path/to/some-policy/some-host
policyPath: path/to/some-policy/with-secrets # defaults to /
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance we could have gotten the secret IDs to fetch from the workload itself? To leverage our existing annotations that Secrets Provider supports.

authnUrl: https://conjur-auth.some-org.com
applianceUrl: https://conjur.some-org.com
account: some-org
authnLogin: path/to/some-policy/some-host
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should align the parameter names to what we have in Secrets Provider to reduce confusion.
I see that there we call it authnIdentity (reference)

@jvanderhoof
Copy link
Contributor

A couple of question and things I feel like are missing from this design:

  • Does the Kubernetes community include any guidance/expected interface/etc. for Secret Providers? I'm assuming a gRPC server is the correct way to accomplish this, but I don't see any references to the gRPC interface we'll be targeting.
  • It's likely my lack of knowledge about the CSI, but can you please include a diagram of how our custom gRPC service interacts with Conjur and Kubernetes to deliver secrets? It will help those with less deep knowledge better understand this design.
  • Can you please include some common workflow scenarios? From this document, I understand the happy path scenario (it "just works"). What would a scenario with bad credentials or lack of permission look like?
  • Operationally, how will we handle fixes/additional functionality in the future (aka, upgrades)?
  • What sort of messaging will be included for humans installing and maintaining this to understand and troubleshoot connection issues with Conjur (or installation issues)?

@jvanderhoof
Copy link
Contributor

@doodlesbykumbi, After reviewing the video, I have a much better sense of the approach of this work. That being said, I'd really like to see us expand this document to capture a fuller picture of the project intention and effort. The reason it's important we capture this in writing and not video is that by creating a full picture in a single design document, we provide an easy to find and reference resource for future engineers and product owners.

I feel this is really missing a crisp High Level Design overview. This should include:

  • As this is a new component, it will be very helpful to get a System High and/or a Container level diagram(s).
  • Summary of the gRPC API (this sounds like it'll be a copy/paste activity from the CSI Secrets page, but it'll help to have all the info in one place)
  • Workflows, one happy path and any likely sad-path flows. I'd love to understand how the Authn-JWT flow works without needing to dig through the PoC. I did couple of different flows for the Policy Factory SD using PlantUML.

As you've done the extensive PoC, it would also be valuable to at least summarize that work in a Low-Level Design section. A quick summary of key components of the gRPC service (even just a Compnent Diagram will help other engineers understand the code organization at a glance instead of extensive code review. To know if your PoC should be taken as a base and expanded or re-written base on your learnings, and the gaps you're aware of in functionality will be helpful for effort estimation as we bring this to fruition.

The high-level design is far more important than the low-level design (high level is limited to interfaces and interactions, low level is the code organization which implements those interfaces). The low level design makes more sense to document after feature completion rather than attempting guess exactly how the code should be organized.

One final comment. A solution design is really a story of a feature to be, in that spirit, can we figure out how to answer the questions in the context of the design instead of in a list at the end? Many of my questions were answered by those questions, but it made reviewing the solution design much more confusing.

Finally, writing is way harder than it looks (at least for me), if it's easier to just hop an a call and chat this through, let's do it! This is a great start, and to re-iterate, after reviewing the presentation, it's very obvious you have an excellent grasp of the scope and intricacies around this feature. This doc is an excellent start, and I just want to see more of those details reflected here.

@adamouamani
Copy link

@doodlesbykumbi - adding questions here for visibility..

  • what parameters are required? Can any be skipped (ie: policyPath)
  • Are there any char or path length limits for policy paths
  • helm versions supported?
  • code coverage expectations - min of 80%
  • should we add or consider a baseline performance test to compare against secrets-provider-for-k8s?
  • Are secrets retrieved individually or in batch, if one fails what happens to subsequent secrets
  • Any refactoring work we want to do in authn-jwt for the csi driver
  • Can we re-use any of the secrets provider tests or test framework for this
  • Would like to discuss more around specific testing scenarios and common customer use cases
  • Support on different platforms - ie: EKS, AKS, RANCHER, TANZU..
  • Versions of K8s/OCP supported

@doodlesbykumbi
Copy link
Contributor Author

Hey @jvanderhoof .This is great! I appreciate you taking time to give this thorough feedback. I've been exploring Plant UML over the past week and it's such a great way to convey certain kinds of information. I've made some updates to the solution design to address the points you raised

@doodlesbykumbi
Copy link
Contributor Author

doodlesbykumbi commented Aug 21, 2023

@adamouamani Thanks for posting these here

  1. what parameters are required? Can any be skipped (ie: policyPath)
    • Only policyPath should be optional
  2. Are there any char or path length limits for policy paths
    • With the current approach around defining secret specification in the SecretProviderClass instances, this falls under the constraints of keys and values for Kubernetes objects, see https://stackoverflow.com/a/53015758. Keys are 256 characters, and the value can go up to 1mb. So the policy path can be 1mb. While the file path can be 256 characters.
  3. helm versions supported?
    • Helm3
  4. code coverage expectations - min of 80%
    • That's fine. I think the unit tests can cover this base with relative ease
  5. should we add or consider a baseline performance test to compare against secrets-provider-for-k8s?
    • I'm not sure this is relevant
  6. Are secrets retrieved individually or in batch, if one fails what happens to subsequent secrets
    • The strategy demonstrated in the POC is in batch. If one fails then the whole request fails, and the workload never starts up. Which is something very nice about the CSI driver as compared to secrets provider.
  7. Any refactoring work we want to do in authn-jwt for the csi driver
    • Not that I'm aware of. I was able to make this work with authn-jwt as is
  8. Can we re-use any of the secrets provider tests or test framework for this
    • It's possible but I think it's more trouble than it is worth. I think this provider will get the most value from unit type tests
  9. Would like to discuss more around specific testing scenarios and common customer use cases
    • Happy to have those conversations. Though I don't imagine there'll be much novelty in the cases that need to be covered since the provider is relatively simple and really only responsible for fetching Secrets from Conjur. I don't think we need to test the CSI Driver, it's out of scope for us.
  10. Support on different platforms - ie: EKS, AKS, RANCHER, TANZU..
    • Same as below. As long as those different platforms meet the requirements of the Secrets Store CSI Driver it should just work.
  11. Versions of K8s/OCP supported
    • This will be documentation. The support will be conveniently linked to the Secrets Store CSI Driver, since the provider is merely a plugin. That support is currently 1.19+

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants