The AIC Operator enables the Qualcomm® Cloud AI series of hardware on OpenShift clusters by automating the configuration and installation of its Linux device drivers and setting up its Device Plugin.
This Operator relies on the Node Feature Discovery (NFD) and Kernel Module Management (KMM) Operators. Be sure to install them from the OperatorHub (provided by Red Hat, not the Community).
NFD and KMM both require additional configuration after they're installed. NFD requires the CRD located in ./manual_install/qcom-aic-rule-nfd.yaml to be added to the cluster.
oc apply -f ./manual_install/qcom-aic-rule-nfd.yaml
KMM also requires configuration so that firmware can be located correctly. The following command should work for most clusters, but first check that the 'controller_config.yaml' section in the patch matches the cluster's existing configuration. The order of the elements does not matter as long as each one sits under the correct heading (e.g. 'webhook', 'worker'), but every existing element must be present.
oc patch configmap kmm-operator-manager-config -n openshift-kmm --type='json' -p='[{"op": "add", "path": "/data/controller_config.yaml", "value": "healthProbeBindAddress: :8081\nmetricsBindAddress: 127.0.0.1:8080\nleaderElection:\n enabled: true\n resourceID: kmm.sigs.x-k8s.io\nwebhook:\n disableHTTP2: true\n port: 9443\nmetrics:\n enableAuthnAuthz: true\n disableHTTP2: true\n bindAddress: 0.0.0.0:8443\n secureServing: true\nworker:\n runAsUser: 0\n seLinuxType: spc_t\n setFirmwareClassPath: /var/lib/firmware"}]'
The important addition in the patch above is "\n setFirmwareClassPath: /var/lib/firmware". Because the patch sets controller_config.yaml in the kmm-operator-manager-config configmap as a single string value, that line can't be added on its own.
Check that the existing configuration matches the patch everywhere except the firmware path:
oc get configmap kmm-operator-manager-config -n openshift-kmm -o=json
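To confirm whether the firmware path is present, one option is to extract only the controller_config.yaml key and search it. The following is a minimal sketch, not part of the Operator: the function name has_firmware_path is hypothetical, and it does a plain text match for that one line rather than validating the rest of the configuration.

```shell
# Hypothetical helper: reads a controller_config.yaml document on stdin and
# succeeds only if it contains the indented firmware path line from this README.
has_firmware_path() {
  grep -Eq '^[[:space:]]+setFirmwareClassPath:[[:space:]]*/var/lib/firmware[[:space:]]*$'
}

# Example against a cluster (assumes you are already logged in with oc):
#   oc get configmap kmm-operator-manager-config -n openshift-kmm \
#     -o jsonpath='{.data.controller_config\.yaml}' | has_firmware_path \
#     && echo "firmware path set" || echo "firmware path missing"
```

Running the example before the patch should report the path as missing, and after the patch as set.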
After updating the KMM configuration, be sure to restart the KMM controller:
oc delete pod -n openshift-kmm -l app.kubernetes.io/component=kmm
Now, on with building and deploying the AIC Operator.
- go version v1.20.0+
- docker version 17.03+.
- kubectl version v1.11.3+.
- Access to a Kubernetes v1.11.3+ cluster.
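The minimum versions above can be checked with a small comparison helper. This is a sketch under assumptions: version_ge is a hypothetical name, and it relies on the version-sort option (-V) of sort, which GNU coreutils provides.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Uses `sort -V` (version sort) to find the smaller of the two.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example checks (each tool must be on PATH; output formats may vary by release):
#   version_ge "$(go env GOVERSION | sed 's/^go//')" 1.20.0 || echo "go is too old"
#   version_ge "$(docker version --format '{{.Client.Version}}')" 17.03 || echo "docker is too old"
```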
Build and push your image to the location specified by IMG:
make docker-build docker-push IMG=<some-registry>/cloud_ai_openshift_operator:tag VERSION=<version>
make bundle-build bundle-push IMG=<some-registry>/cloud_ai_openshift_operator:tag BUNDLE_IMG=<some-registry>/cloud_ai_openshift_operator_bundle:tag VERSION=<version>
NOTE: These images must be published to the registry you specified, and the working environment must be able to pull them. If the commands above fail, make sure you have the proper permissions for the registry.
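Because IMG has to be identical in both make invocations, it can help to generate the commands from a single registry, tag, and version. The sketch below only prints the commands (a dry run). The function name print_build_commands and the example values are hypothetical.

```shell
# Hypothetical dry-run helper: prints the two make invocations from this README
# with one registry ($1), tag ($2), and version ($3) filled in consistently.
print_build_commands() {
  img="$1/cloud_ai_openshift_operator:$2"
  bundle_img="$1/cloud_ai_openshift_operator_bundle:$2"
  printf 'make docker-build docker-push IMG=%s VERSION=%s\n' "$img" "$3"
  printf 'make bundle-build bundle-push IMG=%s BUNDLE_IMG=%s VERSION=%s\n' "$img" "$bundle_img" "$3"
}

# Example (values are placeholders):
#   print_build_commands registry.example.com/me v1 0.0.1
```

Review the printed commands, then run them from the repository root.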
Install the CRDs into the cluster:
make install
Deploy the Manager to the cluster with the image specified by IMG:
make deploy IMG=<some-registry>/cloud_ai_openshift_operator:tag
NOTE: If you encounter RBAC errors, you may need to grant yourself cluster-admin privileges or be logged in as admin.
Create instances of your solution. You can apply the samples (examples) from config/samples:
kubectl apply -k config/samples/
NOTE: The provided sample defaults rely on the cluster's internal registry. The following image tags should be available:
image-registry.openshift-image-registry.svc:5000/${AIC_NAMESPACE}/quic_aic_device_plugin:0.1
image-registry.openshift-image-registry.svc:5000/${AIC_NAMESPACE}/quic_aic_src:0.1
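To see the concrete references for a given namespace, the two tags above can be expanded with a small helper. This is a sketch: expected_sample_images is a hypothetical name, and the image names and 0.1 tag are copied from the list above.

```shell
# Hypothetical helper: prints the image references the sample defaults expect
# for the namespace passed as $1.
expected_sample_images() {
  for name in quic_aic_device_plugin quic_aic_src; do
    printf 'image-registry.openshift-image-registry.svc:5000/%s/%s:0.1\n' "$1" "$name"
  done
}

# Example (assumes AIC_NAMESPACE is set in your shell):
#   expected_sample_images "$AIC_NAMESPACE"
```

If the images were pushed to the internal registry as image streams, the printed names can be compared against the output of oc get imagestreamtags -n "$AIC_NAMESPACE".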
Delete the instances (CRs) from the cluster:
kubectl delete -k config/samples/
Delete the APIs (CRDs) from the cluster:
make uninstall
Undeploy the controller from the cluster:
make undeploy
For more detailed info on contributions see the CONTRIBUTING file.
NOTE: Run make help for more information on all potential make targets.
More information can be found via the Kubebuilder Documentation
AIC Operator is licensed under the terms of the LICENSE file.