[Ray Core] - Add ability to specify gpu memory resources in addition to gpu units #37574
Comments
cc @ericl Do you have any thoughts on this issue?
We've been discussing this for LLM serving use cases, and this would solve some, but not all, of the problems of scheduling large models. It's not a bad idea to add a logical "gpu_memory" resource automatically though, similar to how we add the "memory" logical resource. This could be done in the same code that adds the "accelerator_type" resource.
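To make the "automatic logical resource" idea concrete, here is a minimal pure-Python sketch of what a node might advertise; the function name `node_resources` and the unit choices are hypothetical, and this is not how Ray's internals are actually written:

```python
def node_resources(num_gpus, gpu_memory_per_gpu_gib, memory_gib):
    """Sketch: alongside the logical "memory" resource, a node could
    also advertise an aggregate "gpu_memory" resource derived from its
    detected GPUs (illustrative only, not Ray's real bookkeeping)."""
    return {
        "GPU": num_gpus,
        "memory": memory_gib,
        # Derived automatically, the way "accelerator_type" is added.
        "gpu_memory": num_gpus * gpu_memory_per_gpu_gib,
    }

# A hypothetical node with four 40 GiB GPUs and 512 GiB of host RAM.
print(node_resources(4, 40, 512))
# → {'GPU': 4, 'memory': 512, 'gpu_memory': 160}
```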
@achordia20 could you add more detail here? What kind of workloads are you able to port between GPU types?
Asking because in our experience there's a lot of per-GPU-type configuration that needs to change 😄 . Are there workloads that can easily move between GPU types?
I also would like to be able to schedule on GPU memory. I think it provides a better utilization strategy for Ray users than an admin segmenting off certain GPUs for one task versus another. My team sees all types of configurations, from very advanced GPU rigs to simple ones. GPU memory seems to be the most logical basis for requesting and allocating resources.
Maybe one way we could support this is by translating gpu_memory into GPU requests of a specific accelerator type label(s) under the hood (i.e., it's syntactic sugar for manually specifying accelerator types). That way we wouldn't have to make changes to the scheduler internals. |
It's a bit strange to specify a percentage of a GPU that's required, since you don't know in advance the specs of the GPU the task will be scheduled on. |
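The "syntactic sugar" translation could look roughly like this pure-Python sketch; `GPU_MEMORY_BY_TYPE` and `translate_gpu_memory` are hypothetical names, not Ray APIs, and the memory table is a hard-coded stand-in for real node specs:

```python
# Hypothetical per-accelerator memory table (GiB). In a real
# implementation these values would come from the cluster's node specs.
GPU_MEMORY_BY_TYPE = {
    "V100": 16,
    "A100": 40,
}

def translate_gpu_memory(gpu_memory_gib, accelerator_type):
    """Translate a gpu_memory request into a fractional num_gpus
    request for a specific accelerator type."""
    total = GPU_MEMORY_BY_TYPE[accelerator_type]
    if gpu_memory_gib > total:
        raise ValueError(f"{accelerator_type} only has {total} GiB")
    return gpu_memory_gib / total

# The same 10 GiB request maps to different GPU fractions per type,
# which also illustrates the percentage ambiguity noted above.
print(translate_gpu_memory(10, "V100"))  # 0.625
print(translate_gpu_memory(10, "A100"))  # 0.25
```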
Hi, a quick update on this: we have a REP and prototype ready for review. Please try them out and leave feedback!
@thatcort @martystack @achordia20 have you had a chance to take a look at the REP and try the prototype?
I added a comment on the REP doc. Overall it looks good! It would be a nice improvement to be able to specify that a task needs multiple GPUs with a certain amount of memory on each. |
Description
Rather than allowing only a num_gpus resource, it would be great to also have the ability to specify num_gpu_resources as a logical requirement.
Use case
This would allow us to port workloads across different GPU types easily. Right now, we have to adjust GPU resource requests for each GPU type, since each has a different amount of GPU memory available.
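The portability argument can be sketched with a toy scheduler: if GPU memory is the request unit, the same job spec fits on whichever node has room, regardless of GPU model. The node names, specs, and `schedule` function below are made-up illustrations, not Ray's actual scheduler:

```python
def schedule(request_gib, nodes):
    """Return the first node whose free GPU memory fits the request,
    reserving that memory; return None if no node fits."""
    for name, res in nodes.items():
        if res["gpu_memory_gib"] >= request_gib:
            res["gpu_memory_gib"] -= request_gib  # reserve it
            return name
    return None  # infeasible

# Hypothetical heterogeneous cluster: one 16 GiB GPU, one 40 GiB GPU.
cluster = {
    "node-v100": {"gpu_memory_gib": 16},
    "node-a100": {"gpu_memory_gib": 40},
}

# A 30 GiB request only fits the larger GPU; no per-GPU-type
# num_gpus tuning is needed in the job spec itself.
print(schedule(30, cluster))  # node-a100
```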