Docker run support for HF_TOKEN authentication using env var pass in #23

Closed
tstescoTT opened this issue on Oct 24, 2024 · 1 comment
Labels: enhancement (New feature or request)

@tstescoTT (Contributor)

To keep with the convention used by vLLM Docker containers, described in https://github.com/vllm-project/vllm/blob/main/docs/source/serving/deploying_with_docker.rst, e.g.:

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:latest \
    --model mistralai/Mistral-7B-v0.1

HUGGING_FACE_HUB_TOKEN is deprecated in favor of HF_TOKEN, as noted in https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables#deprecated-environment-variables, so it's recommended to use HF_TOKEN here.

Passing the token this way would let users download models without having to enter credentials manually, where possible.
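For reference, a minimal sketch of what the analogous run command for this repo could look like; the image name and port below are illustrative placeholders, not this project's actual values:

# Sketch only: <tt-inference-server-image> and the port are placeholders;
# huggingface_hub inside the container reads HF_TOKEN from the environment.
docker run \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    -p 8000:8000 \
    --ipc=host \
    <tt-inference-server-image>:latest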

tstescoTT added the enhancement (New feature or request) label on Oct 24, 2024
tstescoTT added a commit that referenced this issue on Jan 15, 2025
Llama 3.x model support, setup.sh script multiple model support using HF download (#67)

* Llama 3.x model support, setup.sh script multiple model support using HF download

change log:
- add multiple model support using persistent_volume/model_envs/*.env
- setup using Hugging Face huggingface-cli to download models: llama model install script support for llama CLI and huggingface hub #14
- add model setup for llama 3.x
- address Initial vLLM setup fails due to missing HuggingFace permissions #37
- address Docker run support for HF_TOKEN authentication using env var pass in #23
- renamed vllm-tt-metal-llama3-70 to vllm-tt-metal-llama3 for all llama 3.x models
- updated documentation for v0 drop
- add Docker Ubuntu 22.04 option for vLLM llama 3.x

* use vllm.llama3.src.shared.Dockerfile for shared build steps for ubuntu 22.04 and 20.04 Dockerfiles

* use full url TT_METAL_DOCKERFILE_URL to allow for 1 Dockerfile for multiple base images
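To illustrate the mechanisms in the change log above: a per-model env file under persistent_volume/model_envs/ could carry the token, e.g. (the file name and the HF_MODEL_REPO_ID variable are assumptions for illustration; only HF_TOKEN is confirmed by this issue):

# persistent_volume/model_envs/llama-3.1-70b-instruct.env (hypothetical example)
HF_TOKEN=<secret>
HF_MODEL_REPO_ID=meta-llama/Llama-3.1-70B-Instruct

With HF_TOKEN exported, the setup script can then download models non-interactively, since huggingface-cli reads the token from the environment:

# huggingface-cli is the official huggingface_hub CLI; it picks up HF_TOKEN
# from the environment, so no interactive login is required.
huggingface-cli download meta-llama/Llama-3.1-70B-Instruct

The single-Dockerfile-for-multiple-base-images approach mentioned in the last bullet typically amounts to passing the full base image URL as a build argument (a sketch of the assumed pattern, not the repo's actual Dockerfile):

# ARG declared before FROM parameterizes the base image
ARG TT_METAL_DOCKERFILE_URL
FROM ${TT_METAL_DOCKERFILE_URL}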
@tstescoTT (Contributor, Author)

Addressed in #67
