Llama 3.x model support, setup.sh script multiple model support using HF download #67

Merged (3 commits) on Jan 15, 2025

tstescoTT
Contributor

change log:

- add multiple model support using persistent_volume/model_envs/*.env
- set up model downloads using the Hugging Face huggingface-cli; addresses "llama model install script support for llama CLI and huggingface hub" #14
- add model setup for llama 3.x
- address "Initial vLLM setup fails due to missing HuggingFace permissions" #37
- address "Docker run support for HF_TOKEN authentication using env var pass in" #23
- renamed vllm-tt-metal-llama3-70 to vllm-tt-metal-llama3 for all llama 3.x models
- updated documentation for v0 drop
- add Docker Ubuntu 22.04 option for vLLM llama 3.x
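
For reference, a rough sketch of how one env file per model under persistent_volume/model_envs/ could drive a huggingface-cli download. This is not the actual setup.sh: the variable names (MODEL_ENVS_DIR, WEIGHTS_DIR, HF_MODEL_REPO_ID), the weights path, and the function names are assumptions made for illustration. The sketch only prints the download command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the real setup.sh.
# Assumed layout: one <model>.env file per model, each defining HF_MODEL_REPO_ID.
MODEL_ENVS_DIR="${MODEL_ENVS_DIR:-persistent_volume/model_envs}"
WEIGHTS_DIR="${WEIGHTS_DIR:-persistent_volume/model_weights}"  # assumed path

# Print the name of every model that has an env file, one per line.
list_models() {
    local f
    for f in "$MODEL_ENVS_DIR"/*.env; do
        [ -e "$f" ] || continue   # the glob matched nothing
        basename "$f" .env
    done
}

# Print the huggingface-cli command that would download the given model.
hf_download_cmd() {
    local model="$1"
    local env_file="$MODEL_ENVS_DIR/$model.env"
    [ -f "$env_file" ] || { echo "no env file for model: $model" >&2; return 1; }
    # Source the env file in a subshell so its variables do not leak out.
    (
        . "$env_file"
        [ -n "${HF_MODEL_REPO_ID:-}" ] || { echo "HF_MODEL_REPO_ID not set" >&2; exit 1; }
        echo "huggingface-cli download $HF_MODEL_REPO_ID --local-dir $WEIGHTS_DIR/$model"
    )
}
```

Gated repositories such as the Llama models need an authenticated download, so in practice HF_TOKEN would have to be set in the environment before the printed command is run.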
Contributor

Is there any difference other than FROM local/tt-metal/tt-metalium/ubuntu-22.04-amd64:$TT_METAL_DOCKERFILE_VERSION between this Dockerfile and the one for 20.04?

Wondering if that could be parameterized and this combined into just a single file?

Contributor Author

@tstescoTT tstescoTT Jan 15, 2025

I tried doing this with a multistage build, but it doesn't work well. Multistage builds are best when the fork happens later in the process and all combinations need to be built. Unfortunately, because tt-metal isn't publishing release images for Ubuntu 22.04, we have to build them manually here, so it is easier to keep the Ubuntu 22.04 build out of the default path, which means fewer steps.

One way to do this is to put as many of the RUN commands as possible into a setup.sh script, then copy and run it, but this won't cover everything (e.g. CMD, COPY, etc.) and would require extra testing.

The best way I found to do this is to use TT_METAL_DOCKERFILE_URL instead of TT_METAL_DOCKERFILE_VERSION so we can pass in a different base image entirely. This supports locally built Ubuntu 22.04 images, GHCR published Ubuntu 20.04 tt-metal images, and will support GHCR Ubuntu 22.04 tt-metal images once those are published.
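
A minimal sketch of what passing the base image in could look like from the build side, assuming the Dockerfile declares ARG TT_METAL_DOCKERFILE_URL before its FROM line and uses it as the base image. The function name, the default Dockerfile path, and the version tag in the example call are placeholders, not values from this PR. The function prints the docker build command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: only TT_METAL_DOCKERFILE_URL and the local image path
# come from this review thread; every other name here is an assumption.

# Print the docker build command for a given base image reference.
#   $1: full base image reference (registry/path:tag)
#   $2: Dockerfile path (placeholder default)
docker_build_cmd() {
    local base_image="$1"
    local dockerfile="${2:-Dockerfile}"
    [ -n "$base_image" ] || { echo "base image reference required" >&2; return 1; }
    printf 'docker build --build-arg TT_METAL_DOCKERFILE_URL=%s -f %s .\n' \
        "$base_image" "$dockerfile"
}

# Example: a locally built Ubuntu 22.04 base image (the tag is a placeholder).
docker_build_cmd "local/tt-metal/tt-metalium/ubuntu-22.04-amd64:v0.0.0"
```

With this shape, switching between a locally built image and a registry-published one only changes the argument, and the Dockerfile itself stays the same.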

Contributor Author

added this in 11141de
