diff --git a/source/cloud/gcp/dataproc.md b/source/cloud/gcp/dataproc.md
index bfe9c571..eb78cd48 100644
--- a/source/cloud/gcp/dataproc.md
+++ b/source/cloud/gcp/dataproc.md
@@ -18,9 +18,18 @@ $ gsutil cp gs://goog-dataproc-initialization-actions-${REGION}/rapids/rapids.sh
 
 **1. Create Dataproc cluster with Dask RAPIDS.** Use the gcloud command to create a new cluster. Because of an Anaconda version conflict, script deployment on older images is slow, we recommend using Dask with Dataproc 2.0+.
 
+```{warning}
+At the time of writing [Dataproc only supports RAPIDS version 23.12 and earlier with CUDA<=11.8 and Ubuntu 18.04](https://github.com/GoogleCloudDataproc/initialization-actions/issues/1137).
+
+Please ensure that your setup complies with this compatibility requirement. Using newer RAPIDS versions may result in unexpected behavior or errors.
+```
+
 ```console
 $ CLUSTER_NAME=
 $ DASK_RUNTIME=yarn
+$ RAPIDS_VERSION=23.12
+$ CUDA_VERSION=11.8
+
 $ gcloud dataproc clusters create $CLUSTER_NAME\
  --region $REGION\
  --image-version 2.0-ubuntu18\
@@ -31,7 +40,7 @@ $ gcloud dataproc clusters create $CLUSTER_NAME\
  --initialization-actions=gs://$GCS_BUCKET/install_gpu_driver.sh,gs://$GCS_BUCKET/dask.sh,gs://$GCS_BUCKET/rapids.sh\
  --initialization-action-timeout 60m\
  --optional-components=JUPYTER\
- --metadata gpu-driver-provider=NVIDIA,dask-runtime=$DASK_RUNTIME,rapids-runtime=DASK\
+ --metadata gpu-driver-provider=NVIDIA,dask-runtime=$DASK_RUNTIME,rapids-runtime=DASK,rapids-version=$RAPIDS_VERSION,cuda-version=$CUDA_VERSION\
  --enable-component-gateway
 ```