Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Tekton-backed Kubeflow Pipelines #1554

Merged
merged 2 commits into from
Feb 2, 2022

Conversation

anishasthana
Copy link
Member

@anishasthana anishasthana commented Jan 25, 2022

Closes operate-first/support#435

Signed-off-by: Anish Asthana [email protected]

@sesheta sesheta added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 25, 2022
@sesheta sesheta added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jan 25, 2022
@sesheta sesheta requested review from durandom and larsks January 25, 2022 11:37
@anishasthana
Copy link
Member Author

Need to figure out the implications of the required SCC, probably move some of the contents of this PR around to different folders too.

@HumairAK
Copy link
Member

/hold

@sesheta sesheta added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 28, 2022
@HumairAK
Copy link
Member

Okay I'm going through the manifests. So off the bat, all the cluster-scope manifests should go to the cluster-scope directory here: https://github.com/operate-first/apps/tree/master/cluster-scope

See the readme docs in that link on how to structure it.

You might find this tool usefule for this https://github.com/larsks/halberd Remember you have to pull these into the smaug overlay.

Having said that, there are a lot of questionable manifests here ... in my opinion, at least for what it seems to be asking. And I will try to go through them.

@HumairAK
Copy link
Member

HumairAK commented Jan 28, 2022

This PR is installing csi-s3:

This is a Container Storage Interface (CSI) for S3 (or S3 compatible) storage. This can dynamically allocate buckets and mount them via a fuse mount into any container.

I don't think we want to just install a new CSI that can mount s3 storage w/o a separate discussion on why we should add it...

Do we need this? Which component needs it? Can we use the s3 we already have?

I see that this all results in a new storage class kfp-csi-s3 but...there are no pvs/pvcs that use it, leading me to believe it's not being utilized.

@HumairAK
Copy link
Member

HumairAK commented Jan 28, 2022

@anishasthana I also noticed in your demo deployed on the dev cluster that the cache-deployer and cache-server are not running successfully either. Is this something we need? Maybe we can try taking these out to reduce some of this clutter.

I skimmed through the manifests/configfiles/secrets and I also don't see any reference to the cache-deployer service. (could be I just missed it)

@HumairAK
Copy link
Member

HumairAK commented Jan 28, 2022

So there's a minio deployment also included, we could probably trim this down and just use an OBC to get our bucket. The following deployments seem to require the s3 creds:

apiVersion: apps/v1
kind: Deployment
metadata:
    name: ml-pipeline-ui
    namespace: kubeflow
---

apiVersion: apps/v1
kind: Deployment
metadata:
    name: ml-pipeline
    namespace: kubeflow

Both of these require s3 creds from a secret, we can sub our own s3 for these.
The secret they are currently mounting is: mlpipeline-minio-artifact
The deployment ml-pipeline also mounts a configmap called kfp-tekton-config which seems to take in the s3 endpoint.

@anishasthana anishasthana changed the title WIP: Add Tekton-backed Kubeflow Pipelines Add Tekton-backed Kubeflow Pipelines Jan 29, 2022
@sesheta sesheta removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 29, 2022
@anishasthana
Copy link
Member Author

We disabled creds-init by setting the disable-creds-init field in the feature-flag configmap in the openshift-pipelines namespace to True. This fixed another error.
Context: shipwright-io/build#679 (comment)

@anishasthana
Copy link
Member Author

track_artifacts: "false" in the kfp-tekton config map

@anishasthana anishasthana force-pushed the add_kfp_smaug branch 4 times, most recently from 4d471d4 to e42daf8 Compare February 2, 2022 17:33
Copy link
Member

@HumairAK HumairAK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@sesheta sesheta added the lgtm Indicates that a PR is ready to be merged. label Feb 2, 2022
@sesheta
Copy link
Member

sesheta commented Feb 2, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: HumairAK

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sesheta sesheta added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 2, 2022
@HumairAK HumairAK merged commit 6a202c1 into operate-first:master Feb 2, 2022
@anishasthana anishasthana deleted the add_kfp_smaug branch February 3, 2022 10:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide an OPF Kubeflow Pipelines instance
3 participants