Unable to Define and Use Different Docker Images for Different Tasks in Kedro-Vertex (vertexai.yml) #129
Comments
Hi @7pandeys, thanks for raising the issue! Defining and using different Docker images for distinct tasks is not supported by this plugin at this point. The reason is that Kedro is primarily focused on creating reproducible pipelines rather than on orchestration. As such, the plugin assumes a single Docker image per pipeline, which keeps it easier to use. We understand that this might not align with your specific, fairly advanced use case, which is more focused on the orchestration part. If you’re keen to contribute, we would be happy to help you design this feature and accept a PR - it would be a nice addition to the plugin 🙂
A potential solution could be based on node/pipeline tags, with a tag-to-Docker-image dictionary provided in the config as an optional setting and the default image staying as is.
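To make the tag-based idea concrete, here is a minimal sketch of how such a lookup might behave. Everything in it is hypothetical: the `TAG_IMAGES` mapping, the image names, and the `resolve_image` helper are assumptions for illustration and are not part of the plugin or of the current vertexai.yml schema.

```python
from typing import Dict, Iterable

# Hypothetical config values; the plugin does not read anything like this today.
DEFAULT_IMAGE = "gcr.io/my-project/pipeline:latest"
TAG_IMAGES: Dict[str, str] = {
    "training": "gcr.io/my-project/training-gpu:latest",
    "inference": "gcr.io/my-project/inference-slim:latest",
}


def resolve_image(
    node_tags: Iterable[str],
    tag_images: Dict[str, str],
    default_image: str,
) -> str:
    """Pick the image for a node from its tags, falling back to the default."""
    # Collect every distinct image that the node's tags map to.
    matches = {tag_images[tag] for tag in node_tags if tag in tag_images}
    if len(matches) > 1:
        # Two tags pointing at different images is ambiguous, so fail loudly.
        raise ValueError(f"Conflicting images for tags: {sorted(matches)}")
    return matches.pop() if matches else default_image


# Usage: a node tagged "training" gets the GPU image, an untagged node the default.
training_image = resolve_image({"training"}, TAG_IMAGES, DEFAULT_IMAGE)
untagged_image = resolve_image(set(), TAG_IMAGES, DEFAULT_IMAGE)
```

Raising on conflicting tags is one possible design choice; a real implementation could instead define a precedence order between tags.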
Also, @7pandeys, could you expand on why having distinct Docker images for different tasks is important for you? What does it optimize? The only case I can think of is when you want to use different architectures for different computing steps and the code cannot run on both from a single image. Apart from that, isn't it just Docker image size, and thus network bandwidth, that is optimized here?
Hi @Lasica, can you please elaborate? 🙂
I was referring to a potential implementation of the solution. This use case is not yet supported, as marrrcin stated. I think the feature that adds job grouping based on tags could also differentiate parameters for such groups, using a dictionary in the config that maps tags to those parameters.
Could you elaborate on why that feature is needed/useful, @7pandeys?
Hi @Lasica, IMO it is normal, once you get serious about production-ready pipelines, to have different compute architectures underlying different steps and different dependencies for each step. An inference step needs much lighter compute and dependencies than a training step, which might need a GPU, specific packages, and an image that is tightly coupled to the compute architecture, or than a pre-processing step, which might need a heavy CPU. I do understand that having a single image makes things easy for an entry-level project, but I do not see how you avoid ending up with a giant image containing the dependencies of every step (which might not be compatible with each other), which in turn limits the type of underlying compute that the pipeline can use. Please let us know if we are misunderstanding how to use the Kedro VertexAI plugin; any advice on how the community is tackling this problem would be of great help. We are currently evaluating cloud-agnostic tools for model pipelines and deciding whether to build our own internally. This feature is one of the requirements/questions that came up while discussing Kedro. And maybe one final question: is it possible to set a different machine type per step?
Happy to evaluate how difficult it would be to add this to Kedro, BTW. We would need to look at the SageMaker side as well; hopefully there is a Kedro SageMaker plugin.
@rragundez Using a different Docker image per step is not supported and is unlikely to be supported, because it goes against the Kedro design and against pipeline reproducibility. Kedro is not an orchestration framework. As for your second question:

    # excerpt from vertexai.yml
    # see https://kedro-vertexai.readthedocs.io/en/0.9.1/source/02_installation/02_configuration.html

    # Optional section allowing adjustment of the resources
    # reservations and limits for the nodes
    resources:

      # For nodes that require more RAM you can increase the "memory"
      data-import-node:
        memory: 2Gi

      # Training nodes can utilize more than one CPU if the algorithm
      # supports it
      model-training-node:
        cpu: 8
        memory: 60Gi

      # GPU-capable nodes can request 1 GPU slot
      tensorflow-node:
        gpu: 1

      # Resources can be also configured via nodes tag
      # (if there is node name and tag configuration for the same
      # resource, tag configuration is overwritten with node one)
      gpu_node_tag:
        cpu: 1
        gpu: 2

      # Default settings for the nodes
      __default__:
        cpu: 200m
        memory: 64Mi

    # Optional section allowing to configure node selectors constraints
    # like gpu accelerator for nodes with gpu resources.
    # (Note that not all accelerators are available in all
    # regions - https://cloud.google.com/compute/docs/gpus/gpu-regions-zones)
    # and not for all machines and resources configurations -
    # https://cloud.google.com/vertex-ai/docs/training/configure-compute#specifying_gpus
    node_selectors:
      gpu_node_tag:
        cloud.google.com/gke-accelerator: NVIDIA_TESLA_T4
      tensorflow-step:
        cloud.google.com/gke-accelerator: NVIDIA_TESLA_K80
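The precedence described in the comments of that excerpt (defaults first, then tag settings, then node-name settings on top) can be illustrated with a short sketch. This is only a reading of those comments, not the plugin's actual code: the `resolve_resources` helper is hypothetical, and the `RESOURCES` dictionary simply mirrors part of the excerpt as YAML would parse it.

```python
from typing import Any, Dict, Iterable

# Mirrors part of the `resources` section from the excerpt above.
RESOURCES: Dict[str, Dict[str, Any]] = {
    "model-training-node": {"cpu": 8, "memory": "60Gi"},
    "tensorflow-node": {"gpu": 1},
    "gpu_node_tag": {"cpu": 1, "gpu": 2},
    "__default__": {"cpu": "200m", "memory": "64Mi"},
}


def resolve_resources(
    node_name: str,
    node_tags: Iterable[str],
    resources: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Merge resource settings: defaults, then tags, then the node's own entry."""
    resolved = dict(resources.get("__default__", {}))
    # Tags are applied in sorted order so the result is deterministic
    # (an assumption; the excerpt does not say how several tags interact).
    for tag in sorted(node_tags):
        resolved.update(resources.get(tag, {}))
    # Node-name settings win over tag settings for the same resource.
    resolved.update(resources.get(node_name, {}))
    return resolved


# Usage: the tag asks for 2 GPUs, but the node-level entry overrides it with 1.
tf_resources = resolve_resources("tensorflow-node", {"gpu_node_tag"}, RESOURCES)
```

With these inputs, `cpu` comes from the tag, `memory` from the defaults, and `gpu` from the node entry.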
I think it's fair to request that all parameters available in the Vertex AI node config API be configurable somehow. We should probably take a fresh look at how this configuration should look, taking into account the upcoming grouping feature, which should also support methods of grouping other than tags and has to keep the config valid/consistent among the groups. Such a change could be a breaking one, however, so let's take time to plan it, with a deprecation period for the old way.
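One way to picture the "valid/consistent among the groups" requirement is a check that every node placed in the same group resolves to the same settings. The sketch below is purely illustrative: the grouping feature is not described in this thread, so the `groups` and `node_configs` structures and the `find_inconsistent_groups` helper are all assumptions.

```python
from typing import Any, Dict, List, Mapping, Sequence


def find_inconsistent_groups(
    groups: Mapping[str, Sequence[str]],
    node_configs: Mapping[str, Dict[str, Any]],
) -> List[str]:
    """Return the names of groups whose nodes do not share one identical config."""
    inconsistent = []
    for group_name, node_names in groups.items():
        configs = [node_configs[name] for name in node_names]
        # A group is consistent when every member matches the first one.
        if any(config != configs[0] for config in configs[1:]):
            inconsistent.append(group_name)
    return inconsistent


# Hypothetical example: two nodes grouped together but configured differently.
groups = {
    "training": ["fit-model", "evaluate-model"],
    "preprocessing": ["clean-data"],
}
node_configs = {
    "fit-model": {"gpu": 1, "memory": "60Gi"},
    "evaluate-model": {"gpu": 0, "memory": "8Gi"},
    "clean-data": {"cpu": 8},
}
problems = find_inconsistent_groups(groups, node_configs)
```

A check like this could run when the config is loaded, so that a grouped job is never scheduled with ambiguous resources or images.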
Problem:
I am facing difficulties in Kedro-Vertex when trying to define and use different Docker images for distinct tasks, such as data preprocessing, model training, and model inference.
Expected Behavior:
I expect to be able to specify and use separate Docker images for various tasks within my Kedro-Vertex workflow. This flexibility is crucial for optimizing resource utilization and dependencies for different stages of my vertex pipeline.
Current Behavior:
I have scoured the documentation and explored the codebase but have not found clear instructions on how to achieve this feature. As a result, I'm uncertain about how to implement different Docker images for different tasks.
Steps to Reproduce:
Additional Information:
Environment:
Suggested Solution:
It would be incredibly valuable to provide documentation or examples demonstrating how to define and use different Docker images for various tasks within a Kedro-Vertex project. If this feature is not currently supported, it would be helpful to know its status and any potential workarounds.
Related links:
https://github.com/getindata/kedro-vertexai/blob/develop/kedro_vertexai/config.py
https://kedro-vertexai.readthedocs.io/en/0.9.1/source/02_installation/02_configuration.html
Notes:
vertexai.yml is generated by the command kedro vertexai init
This issue is aimed at improving the flexibility and resource management in Kedro-Vertex by allowing users to define and use different Docker images for different tasks. Your attention to this matter is greatly appreciated.