- Red Hat OpenShift Data Science version 1.22 or above.
- Data Science Pipelines (follow the deployment guide for reference).
Spin up a Minio instance by deploying the `manifests/minio/minio.yaml` manifest (see the CLI sketch below). Once it's running, create the `models` bucket through its GUI, which is exposed through the Minio route in project `minio`. The credentials are:
- login (access key): `minio`
- password (secret key): `minio123`
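If you prefer the command line, here is a minimal sketch with the `oc` client; the project and route names follow from the manifest above:

```sh
# Deploy the Minio manifest.
oc apply -f manifests/minio/minio.yaml

# Look up the route that exposes the Minio GUI.
oc get routes -n minio
```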
Update and deploy `manifests/odh/demo-pipeline-secret.yaml`:
- `s3_endpoint_url`: your S3 endpoint URL, such as `http://s3.openshift-storage.svc.cluster.local`.
- `s3_accesskey`: an S3 access key with bucket creation permissions, for example the value of `AWS_ACCESS_KEY_ID` in secret `noobaa-admin` in project `openshift-storage`.
- `s3_secret_key`: the corresponding S3 secret key, for example the value of `AWS_SECRET_ACCESS_KEY` in secret `noobaa-admin` in project `openshift-storage`.
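A sketch for looking up the NooBaa example credentials and deploying the updated secret, assuming OpenShift Data Foundation provides the S3 endpoint:

```sh
# Read the S3 credentials from the noobaa-admin secret.
oc get secret noobaa-admin -n openshift-storage \
  -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d; echo
oc get secret noobaa-admin -n openshift-storage \
  -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d; echo

# After filling the values into the manifest, deploy it.
oc apply -f manifests/odh/demo-pipeline-secret.yaml
```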
- Enter or launch the Elyra KFNBC notebook in the Jupyter spawner page.
- Clone this repository.
  - Open the git client (`Git` in the left toolbar).
  - Select `Clone a Repository`.
  - Enter the repository URL `https://github.com/mamurak/os-mlops.git` and select `Clone`.
  - Authenticate if necessary.
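Alternatively, clone from a terminal within the notebook environment:

```sh
git clone https://github.com/mamurak/os-mlops.git
```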
- Open `notebooks/elyra-kfp-onnx-example/model-training.pipeline` in the Kubeflow Pipeline Editor.
- Select `Run Pipeline` in the top toolbar.
- Select `OK`.
- Monitor pipeline execution in the Kubeflow Pipelines user interface (`ds-pipelines-ui` route URL) under `Runs`.
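To find that URL from the CLI; the project hosting the route depends on your Data Science Pipelines installation, so `<pipelines-project>` is a placeholder:

```sh
oc get route ds-pipelines-ui -n <pipelines-project> \
  -o jsonpath='{.spec.host}'; echo
```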
- Change the available notebook deployment sizes.
  - Find the `odh-dashboard-config` object of kind `OdhDashboardConfig` in project `odh-applications`.
  - Add or update the `spec.notebookSizes` property. Check `manifests/odh/odh-dashboard-config.yaml` for reference, or patch the object as sketched below.
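A sketch of such a patch; the size name and resource values are illustrative, not taken from the repository:

```sh
# A merge patch replaces the whole notebookSizes list, so include
# every size you want to keep.
oc patch odhdashboardconfig odh-dashboard-config -n odh-applications \
  --type merge -p '
spec:
  notebookSizes:
  - name: Small
    resources:
      requests:
        cpu: "1"
        memory: 2Gi
      limits:
        cpu: "2"
        memory: 4Gi
'
```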
- Clone git repositories with JupyterLab.
  - Open the git client (`Git` in the left toolbar).
  - Select `Clone a Repository`.
  - Enter the repository URL and select `Clone`.
  - Authenticate if necessary.
- Build and add a custom notebook image (CLI sketch below).
  - Deploy `manifests/odh/images/custom-notebook-is.yaml`.
  - Deploy `manifests/odh/images/custom-notebook-bc.yaml`.
  - Trigger a build of the new build config and wait until the build finishes.
  - As an ODH admin user, open the `Settings` tab in the ODH dashboard.
  - Select `Notebook Images` and `Import new image`.
  - Add a new notebook with repository URL `custom-notebook:latest` and appropriate metadata.
  - Verify the custom notebook integration in the JupyterHub provisioning page. You should be able to provision an instance of the custom notebook that you defined in the previous step.
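The first three steps from the command line; the build config name `custom-notebook` is an assumption based on the manifest file names:

```sh
# Create the image stream and build config.
oc apply -f manifests/odh/images/custom-notebook-is.yaml
oc apply -f manifests/odh/images/custom-notebook-bc.yaml

# Trigger the build and stream its log until it finishes.
oc start-build custom-notebook --follow
```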
- Add packages to the custom notebook image with pinned versions.
  - Within a custom notebook instance, install the package through `pip install {your-package}`.
  - Note the installed version of the package.
  - Add a new entry in `container-images/custom-notebook/requirements.txt` with `{your-package}=={installed-version}`.
  - Trigger a new image build.
  - Once the build is finished, provision a new notebook instance using the custom notebook image. The new package is now available.
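For illustration, with `pandas` standing in for your package and the version being an example:

```sh
# In a notebook terminal: install the package and note its version.
pip install pandas
pip show pandas | grep Version    # e.g. "Version: 2.1.4"

# In your clone of this repository: pin the version and rebuild.
echo "pandas==2.1.4" >> container-images/custom-notebook/requirements.txt
oc start-build custom-notebook --follow
```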
- Create Elyra pipelines within JupyterLab.
  - Open the Launcher (blue plus symbol in the top left corner of the frame).
  - Select `Kubeflow Pipeline Editor`.
  - Drag and drop notebooks from the file browser into the editor.
  - Build a pipeline by connecting the notebooks, drawing lines from the output to the input ports of the node representations. Any directed acyclic graph is supported.
  - For each node, update the node properties (right click on the node and select `Open Properties`):
    - `Runtime Image`: select the appropriate runtime image containing the runtime dependencies of the notebook.
    - `File Dependencies`: if the notebook expects a file to be present, add this file dependency here. It must be present in the file system of the notebook instance.
    - `Environment Variables`: if the notebook expects particular environment variables to be set, you can set them here.
    - `Kubernetes Secrets`: if you would like to set environment variables through Kubernetes secrets rather than defining them explicitly in the Elyra interface, you can reference the corresponding secrets in this field.
    - `Output Files`: if the notebook generates files that are needed by downstream pipeline nodes, reference these files here.
  - Save the pipeline (top toolbar).
- Submit an Elyra pipeline to the Kubeflow Pipelines backend.
  - Open an existing pipeline within the Elyra pipeline editor.
  - Select `Run Pipeline` (top toolbar).
  - Select the runtime configuration you have prepared before and click `OK`.
  - You can now monitor the pipeline execution within the Kubeflow Pipelines GUI under `Runs`.
- If you don't have cluster admin privileges and you're denied access to the ML Pipelines UI route, you may have to update the oauth proxy arguments in the `ds-pipeline-ui` deployment. Refer to `manifests/odh/ds-pipeline-ui-deployment.yaml` for details.
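As a hedged starting point (the exact arguments to change depend on your setup, so treat this as a pointer rather than the definitive fix):

```sh
# Open the deployment and adjust the oauth proxy container's
# authorization arguments as shown in the referenced manifest.
# <pipelines-project> is a placeholder for the project hosting
# Data Science Pipelines.
oc edit deployment ds-pipeline-ui -n <pipelines-project>
```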
- `manifests`: OpenShift deployment artifacts.
- `container-images`: dependencies of container builds.
- `notebooks`: scripts and sample procedures for ODH component integration.
Noteworthy notebook examples in the `notebooks` folder:
- `starburst-odf` examples: how to interact with S3 buckets and the Starburst Enterprise Platform from a Jupyter notebook.
- `elyra` examples: a sample pipeline demonstrating how to create artifacts that can be visualized in Kubeflow Pipelines.
- batch pipeline example: a sample pipeline demonstrating parallel batch processing with Seldon.
- pipeline example: a sample end-to-end ML pipeline including feature extraction, model training, and model deployment through GitOps.