Docker run support for HF_TOKEN authentication using env var pass in #23

Closed
tstescoTT opened this issue on Oct 24, 2024 · 1 comment
Labels: enhancement (New feature or request)

@tstescoTT (Contributor)

To keep with the convention used by vLLM Docker containers, described in https://github.com/vllm-project/vllm/blob/main/docs/source/serving/deploying_with_docker.rst, e.g.:

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:latest \
    --model mistralai/Mistral-7B-v0.1

HUGGING_FACE_HUB_TOKEN is deprecated in favor of HF_TOKEN, as noted in https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables#deprecated-environment-variables, so it's recommended to use HF_TOKEN here.

Passing the token this way would let users download models without having to enter credentials manually, where possible.
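For reference, a minimal sketch of what the analogous run command for this repo could look like; the image name and port below are illustrative placeholders, not this project's actual values:

# Sketch only: <tt-inference-server-image> and the port are placeholders;
# huggingface_hub inside the container reads HF_TOKEN from the environment.
docker run \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    -p 8000:8000 \
    --ipc=host \
    <tt-inference-server-image>:latest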

tstescoTT added the enhancement (New feature or request) label on Oct 24, 2024
tstescoTT added a commit that referenced this issue on Jan 15, 2025
Llama 3.x model support, setup.sh script multiple model support using HF download (#67)

* Llama 3.x model support, setup.sh script multiple model support using HF download

change log:
- add multiple model support using persistent_volume/model_envs/*.env
- setup using Hugging Face huggingface-cli to download models: llama model install script support for llama CLI and huggingface hub #14
- add model setup for llama 3.x
- address Initial vLLM setup fails due to missing HuggingFace permissions #37
- address Docker run support for HF_TOKEN authentication using env var pass in #23
- renamed vllm-tt-metal-llama3-70 to vllm-tt-metal-llama3 for all llama 3.x models
- updated documentation for v0 drop
- add Docker Ubuntu 22.04 option for vLLM llama 3.x

* use vllm.llama3.src.shared.Dockerfile for shared build steps for ubuntu 22.04 and 20.04 Dockerfiles

* use full url TT_METAL_DOCKERFILE_URL to allow for 1 Dockerfile for multiple base images
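To illustrate the mechanisms in the change log above: a per-model env file under persistent_volume/model_envs/ could carry the token, e.g. (the file name and the HF_MODEL_REPO_ID variable are assumptions for illustration; only HF_TOKEN is confirmed by this issue):

# persistent_volume/model_envs/llama-3.1-70b-instruct.env (hypothetical example)
HF_TOKEN=<secret>
HF_MODEL_REPO_ID=meta-llama/Llama-3.1-70B-Instruct

With HF_TOKEN exported, the setup script can then download models non-interactively, since huggingface-cli reads the token from the environment:

# huggingface-cli is the official huggingface_hub CLI; it picks up HF_TOKEN
# from the environment, so no interactive login is required.
huggingface-cli download meta-llama/Llama-3.1-70B-Instruct

The single-Dockerfile-for-multiple-base-images approach mentioned in the last bullet typically amounts to passing the full base image URL as a build argument (a sketch of the assumed pattern, not the repo's actual Dockerfile):

# ARG declared before FROM parameterizes the base image
ARG TT_METAL_DOCKERFILE_URL
FROM ${TT_METAL_DOCKERFILE_URL}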
@tstescoTT (Contributor, Author)

Addressed in #67
