Llama 3.x model support, setup.sh script multiple model support using HF download #67
… HF download
change log:
- add multiple model support using persistent_volume/model_envs/*.env
- setup using Hugging Face huggingface-cli to download models: llama model install script support for llama CLI and huggingface hub #14
- add model setup for llama 3.x
- address Initial vLLM setup fails due to missing HuggingFace permissions #37
- address Docker run support for HF_TOKEN authentication using env var pass in #23
- renamed vllm-tt-metal-llama3-70 to vllm-tt-metal-llama3 for all llama 3.x models
- updated documentation for v0 drop
- add Docker Ubuntu 22.04 option for vLLM llama 3.x
Is there any difference between this Dockerfile and the one for 20.04 other than
FROM local/tt-metal/tt-metalium/ubuntu-22.04-amd64:$TT_METAL_DOCKERFILE_VERSION ?
Wondering if that could be parameterized so the two can be combined into a single file?
I tried doing this with a multistage build, but it doesn't work well. Multistage builds are best when the fork happens later in the process and every combination needs to be built. Unfortunately, because tt-metal isn't publishing release images for Ubuntu 22.04, we have to build them manually here, so it is easier to keep the Ubuntu 22.04 build out of the default path, which means fewer steps.
One way to do this is to put as many of the RUN commands as possible into a setup.sh script, then copy and run it, but this won't cover everything (e.g. CMD, COPY, etc.) and would require extra testing.
The best way I found is to use TT_METAL_DOCKERFILE_URL instead of TT_METAL_DOCKERFILE_VERSION, so we can pass in a different base image entirely. This supports locally built Ubuntu 22.04 images and GHCR-published Ubuntu 20.04 tt-metal images, and will support GHCR Ubuntu 22.04 tt-metal images once those are published.
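For illustration, a minimal sketch of the TT_METAL_DOCKERFILE_URL idea, assuming the variable is exposed as a Docker build argument. The image references in the comments are placeholders rather than the actual published tags, and the real Dockerfile in this PR may differ.

```dockerfile
# Sketch only: not the Dockerfile from this PR.
# An ARG declared before FROM can be used in the FROM instruction, so the
# whole base image reference (registry, name, and tag) is chosen at build time.
ARG TT_METAL_DOCKERFILE_URL
FROM ${TT_METAL_DOCKERFILE_URL}

# ... the RUN, COPY, and CMD instructions shared by both base images go here ...

# Example builds (placeholder image references, filled in by the caller):
#   locally built Ubuntu 22.04 base:
#     docker build --build-arg TT_METAL_DOCKERFILE_URL=local/tt-metal/tt-metalium/ubuntu-22.04-amd64:<version> .
#   GHCR-published Ubuntu 20.04 base:
#     docker build --build-arg TT_METAL_DOCKERFILE_URL=<ghcr-image-url>:<version> .
```

No default is set in this sketch, so a build that omits the build argument fails at the FROM line instead of silently picking a base image.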
added this in 11141de
…tu 22.04 and 20.04 Dockerfiles
…ltiple base images
change log: