Model name for vLLM instantiation #71

rpavlovicTT · 2025-01-20T19:44:42Z

In the run_vllm_server.py script, the vLLM server is configured with hf_model_id as the model name:

def model_setup(hf_model_id):
    # TODO: check HF repo access with HF_TOKEN supplied
    print(f"using model: {hf_model_id}")
    args = {
        "model": hf_model_id,
...

but in the vLLM TT worker implementation we have, for example:

        if ("meta-llama/Meta-Llama-3.1-8B" in self.model_config.model...

setup.sh generates the following variables:

$HF_MODEL_REPO_ID=meta-llama/Llama-3.1-8B-Instruct
$META_MODEL_NAME=Meta-Llama-3.1-8B-Instruct

but neither of them matches the string expected by the vLLM TT worker, so we should consolidate these naming conventions.
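
To make the mismatch concrete, here is a minimal sketch that applies the same substring check to both variables. The pattern string and the two values are copied from the snippets above; the helper name matches_worker_check is hypothetical and only mirrors the `in` test shown in the worker excerpt.

```python
# Substring the TT worker checks for (copied from the worker excerpt above).
WORKER_PATTERN = "meta-llama/Meta-Llama-3.1-8B"

# Values generated by setup.sh (copied from the listing above).
HF_MODEL_REPO_ID = "meta-llama/Llama-3.1-8B-Instruct"
META_MODEL_NAME = "Meta-Llama-3.1-8B-Instruct"


def matches_worker_check(model_name: str) -> bool:
    """Hypothetical helper mirroring the worker's substring test."""
    return WORKER_PATTERN in model_name


# Run the check against both candidate names.
results = {
    name: matches_worker_check(name)
    for name in (HF_MODEL_REPO_ID, META_MODEL_NAME)
}
for name, ok in results.items():
    print(f"{name}: matches worker check = {ok}")
```

The HF repo ID fails because it lacks the "Meta-" prefix in the model part, and the Meta model name fails because it lacks the "meta-llama/" organization prefix.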


tstescoTT · 2025-01-21T22:39:39Z

FYI some WIP on this in https://github.com/tenstorrent/tt-inference-server/blob/tstesco/dev/setup.sh

The plan is to consolidate on the HF repo ID, but we have some legacy model implementations that use different names. Going forward, we can use the HF model ID downstream of setup.
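
One possible shape for that consolidation, as a sketch only: legacy names are translated to the HF repo ID once, and everything downstream sees only the repo ID. The table name LEGACY_TO_HF_REPO_ID and the function to_hf_repo_id are hypothetical; the single entry is the name pair quoted earlier in this issue, and treating any name containing "/" as a repo ID is an assumption.

```python
# Hypothetical lookup table: legacy model name -> HF repo ID.
# The only entry is the pair quoted in this issue.
LEGACY_TO_HF_REPO_ID = {
    "Meta-Llama-3.1-8B-Instruct": "meta-llama/Llama-3.1-8B-Instruct",
}


def to_hf_repo_id(model_name: str) -> str:
    """Return the HF repo ID for a model name (hypothetical helper).

    Assumption: a name containing "/" is already an "org/name" repo ID
    and is passed through unchanged.
    """
    if "/" in model_name:
        return model_name
    try:
        return LEGACY_TO_HF_REPO_ID[model_name]
    except KeyError:
        raise ValueError(f"unknown legacy model name: {model_name!r}") from None
```

Failing loudly on an unknown name, rather than guessing an organization prefix, keeps a typo from silently selecting the wrong model.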

tstescoTT · 2025-01-21T22:44:37Z

I think we should be handling the *-Instruct model versions explicitly because they have different handling for tokenization and chat templating.
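
A rough sketch of what explicit handling could look like, assuming the HF repo ID is the identifier passed around and that Instruct variants can be recognized by an "-Instruct" suffix on the model part of the ID. Both function names are hypothetical, and the suffix rule is an assumption that may not hold for every model family.

```python
from typing import Tuple


def split_repo_id(hf_repo_id: str) -> Tuple[str, str]:
    """Split "org/name" into (org, name); hypothetical helper."""
    org, _, name = hf_repo_id.partition("/")
    if not org or not name:
        raise ValueError(f"expected 'org/name', got {hf_repo_id!r}")
    return org, name


def is_instruct_model(hf_repo_id: str) -> bool:
    """Assumption: Instruct variants end with "-Instruct"."""
    _, name = split_repo_id(hf_repo_id)
    return name.endswith("-Instruct")


# Branch on the variant instead of relying on a substring that
# happens to match both the base and the Instruct model.
repo_id = "meta-llama/Llama-3.1-8B-Instruct"
mode = "chat-template" if is_instruct_model(repo_id) else "plain-completion"
print(f"{repo_id}: {mode}")
```

Checking the suffix on the model part only avoids false positives from the organization name.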

tstescoTT · 2025-02-06T02:32:19Z

Addressed in #88.

tstescoTT closed this as completed Feb 6, 2025
