[gpu] Exercise new template-generated GPU driver installer #1290

Draft: cjac wants to merge 125 commits into master from gpu-template-20250107

@cjac cjac commented Jan 8, 2025

This PR exercises the generated test against the unmodified test suite on master.

@cjac cjac self-assigned this Jan 8, 2025

cjac commented Jan 8, 2025

/gcbrun

@cjac cjac force-pushed the gpu-template-20250107 branch from e70ed0c to 85e3ec7 Compare January 8, 2025 02:42

cjac commented Jan 8, 2025

/gcbrun

@cjac cjac force-pushed the gpu-template-20250107 branch from 85e3ec7 to b412084 Compare January 8, 2025 03:46

cjac commented Jan 8, 2025

/gcbrun

@cjac cjac force-pushed the gpu-template-20250107 branch from b412084 to c9a22d6 Compare January 8, 2025 04:14

cjac commented Jan 8, 2025

/gcbrun

@cjac cjac force-pushed the gpu-template-20250107 branch from c9a22d6 to ca0945b Compare January 8, 2025 04:47

cjac commented Jan 8, 2025

/gcbrun

@cjac cjac force-pushed the gpu-template-20250107 branch from ca0945b to 7167512 Compare January 8, 2025 05:41

cjac commented Jan 8, 2025

/gcbrun

cjac commented Jan 8, 2025

/gcbrun

cjac added 8 commits January 8, 2025 22:09
* increased minimum memory threshold for ram disk
* moved apt_add_repo and friends to common/install_functions

templates/dask/util_functions:
* validating conda tarball before caching to gcs

templates/generate-action.pl:
* improved usage documentation a little

templates/gpu/install_functions:
* using /opt/conda/miniconda3/bin/python3 instead of /usr/bin/ for
  venv pre-install
* increase wait time for scheduler to come online
* reduce noise from tar -t

templates/gpu/yarn_functions,
templates/gpu/install_functions:
* protect many functions from running without attached accelerator

templates/gpu/install_gpu_driver.sh.in:
* set +e in exit handler

templates/gpu/spark_functions:
* re-factor new function into this template

templates/spark-rapids/spark-rapids.sh.in:
* removed redundant call to configure_gpu_script
* set +e in exit handler
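
Two of the changes listed above (validating a tarball before caching it, with quieter `tar -t` output, and `set +e` in the exit handler) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the function names `tarball_is_valid` and `exit_handler` and the `SCRATCH_DIR` variable are hypothetical.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; the names below are hypothetical, not from the PR.

# Succeed only if the file is non-empty and tar can list it end to end.
# The listing is discarded so that large archives do not flood the log.
tarball_is_valid() {
  [ -s "$1" ] || return 1
  tar -tf "$1" > /dev/null 2>&1
}

# Exit handler that turns off errexit first, so that a failing cleanup
# step does not prevent the steps after it from running.
exit_handler() {
  set +e
  # SCRATCH_DIR is a hypothetical variable; rmdir fails when the directory
  # does not exist, and the handler carries on regardless.
  rmdir "${SCRATCH_DIR:-/nonexistent/scratch}" 2> /dev/null
  echo "exit handler finished"
}
```

A caller could register the handler with `trap exit_handler EXIT` and upload to the cache only when `tarball_is_valid` succeeds on the local file.
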
@cjac cjac force-pushed the gpu-template-20250107 branch 2 times, most recently from b7e53d1 to 1a6fd39 Compare January 9, 2025 20:27

cjac commented Jan 9, 2025

/gcbrun

@cjac cjac force-pushed the gpu-template-20250107 branch from 21818bd to 81bd2f4 Compare January 9, 2025 21:35
@cjac cjac force-pushed the gpu-template-20250107 branch 2 times, most recently from 8b94c08 to 997a61c Compare January 9, 2025 22:37

cjac commented Jan 9, 2025

/gcbrun

@cjac cjac changed the title from "[gpu] this PR for testing purposes only" to "[gpu] Exercise new template-generated GPU driver installer" Jan 10, 2025
@cjac cjac force-pushed the gpu-template-20250107 branch from 997a61c to 6bb948b Compare January 10, 2025 04:25

cjac commented Jan 10, 2025

/gcbrun

gpu/test_gpu.py:
* using tests from GoogleCloudDataproc#1275

gpu/verify_pyspark.py:
* new test file; will probably be moved to mlvm

templates/gpu/install_gpu_driver.sh.in:
* this action template includes only the code unique to this action
@cjac cjac force-pushed the gpu-template-20250107 branch from 6bb948b to 453e6e6 Compare January 10, 2025 04:32

cjac commented Jan 10, 2025

/gcbrun
