v0.15.0
HyperQueue 0.15.0
Breaking changes
- NVIDIA GPUs are now automatically detected under the resource name gpus/nvidia, instead of
  just gpus! If you have been using the gpus resource name, you should update your scripts.
  See more details below.
New features
Resource management
- You can now specify more than one resource configuration (variant) for one task, e.g. 1 cpu and 1 gpu OR 4 cpus. The scheduler considers both configurations in task planning.
  For example, assume that we have many tasks with the mentioned configuration and a worker with 16 cpus and 4 gpus.
  The tasks will fully utilize the node: 4 tasks will run in the configuration with a gpu and 3 tasks will run in the cpu-only mode.
- Job Definition File is a TOML file that can define a job.
  It allows you to submit complex jobs (dependencies, resource variants, ...) without using the Python API.
  $ hq job submit-file myfile.toml
- You can now specify (indexed) resource values provided by workers as strings (previously only
  integers were allowed). Notably, automatic detection of NVIDIA GPUs specified with string UUIDs
  now works.
  $ hq worker start --resource="res1=[foo, bar]"
- HyperQueue now provides built-in support for AMD GPUs. For this reason, the default name of GPU
  resources that are automatically detected on a worker has been changed from gpus to gpus/nvidia
  for NVIDIA GPUs. AMD GPUs are now autodetected as gpus/amd. In the future, we intend to create a way
  to ask for any GPU resource (e.g. --resource=gpus=2), regardless of its type.
- AMD GPUs are now automatically detected in workers from the environment variable
  ROCR_VISIBLE_DEVICES.
- The set of allowed characters for resource names has been changed. The name now has to begin with an ASCII letter,
  and it can only contain ASCII letters, ASCII digits and the slash (/) symbol. This restriction is
  introduced for better alignment with shells, which typically do not support complicated variable names.
  HQ passes the resource names to executed tasks through environment variables, so it has to take this
  into account. Note that the / symbol in a resource name will be normalized to _ when being passed
  to a task.
- hq task info now shows more information.
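Two of the resource items above can be sketched in a few lines of Python. The first function reproduces the arithmetic of the variant example (16 cpus and 4 gpus, tasks asking for 1 cpu and 1 gpu OR 4 cpus) using a simple greedy packing; this is only an illustration and is not the algorithm used by the HyperQueue scheduler. The second function applies the resource name rules described above. The names pack_variants and normalize_resource_name are hypothetical and are not part of HyperQueue.

```python
import re


def pack_variants(capacity, variants):
    """Greedily count how many tasks fit per variant, trying variants in order.

    Illustration only: the real scheduler is not assumed to work this way.
    Every amount in a variant is assumed to be a positive integer.
    """
    free = dict(capacity)
    counts = []
    for variant in variants:
        # How many tasks of this variant still fit into the free resources.
        n = min(free.get(name, 0) // amount for name, amount in variant.items())
        for name, amount in variant.items():
            free[name] = free.get(name, 0) - n * amount
        counts.append(n)
    return counts, free


# A name must start with an ASCII letter and may then contain
# ASCII letters, ASCII digits and '/'.
_RESOURCE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9/]*")


def normalize_resource_name(name):
    """Validate a resource name and replace '/' with '_'."""
    if not _RESOURCE_NAME.fullmatch(name):
        raise ValueError(f"invalid resource name: {name!r}")
    return name.replace("/", "_")


if __name__ == "__main__":
    counts, free = pack_variants(
        {"cpus": 16, "gpus": 4},
        [{"cpus": 1, "gpus": 1}, {"cpus": 4}],
    )
    print(counts)  # [4, 3]
    print(free)    # {'cpus': 0, 'gpus': 0}
    print(normalize_resource_name("gpus/nvidia"))  # gpus_nvidia
```

The greedy order matters in general; it happens to reach full utilization for this particular worker and pair of variants.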
Changes
Job submission
- The default path for stdout and stderr files has been changed from
  %{SUBMIT_DIR}/job-%{JOB_ID}/%{TASK_ID}.[stdout/stderr] to
  %{CWD}/job-%{JOB_ID}/%{TASK_ID}.[stdout/stderr]. Note that the default value for the working
  directory (%{CWD}) is set to the submission directory, so if you have used the defaults before,
  nothing will change for you. Stdout and stderr paths are now also resolved relative to the working
  directory of the given task, not to the submit directory.
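The effect of the new default can be sketched as follows. This is a minimal illustration, assuming POSIX-style paths and a simple textual substitution of the %{...} placeholders; the function resolve_output_path is hypothetical and does not reflect how HyperQueue implements path resolution internally.

```python
import posixpath
import re


def resolve_output_path(template, placeholders, cwd):
    """Expand %{NAME} placeholders, then resolve the result relative to cwd.

    The task working directory is made available as %{CWD}.
    """
    values = {"CWD": cwd, **placeholders}

    def substitute(match):
        # An unknown placeholder name raises KeyError.
        return str(values[match.group(1)])

    expanded = re.sub(r"%\{(\w+)\}", substitute, template)
    # An absolute expanded path is kept as-is; a relative one is joined to cwd.
    return posixpath.normpath(posixpath.join(cwd, expanded))


if __name__ == "__main__":
    ids = {"JOB_ID": 5, "TASK_ID": 0}
    # New default template for stdout.
    print(resolve_output_path("%{CWD}/job-%{JOB_ID}/%{TASK_ID}.stdout", ids, "/home/user/run"))
    # /home/user/run/job-5/0.stdout

    # A relative path is resolved against the task working directory.
    print(resolve_output_path("logs/%{TASK_ID}.out", ids, "/home/user/run"))
    # /home/user/run/logs/0.out
```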
Artifact summary:
- hq-v0.15.0-*: Main HyperQueue build containing the hq binary. Download this archive to
  use HyperQueue from the command line.
- hyperqueue-0.15.0-*: Wheel containing the hyperqueue package with HyperQueue Python
  bindings.