
Commit

Add chat_template and enforce_eager options to docstring of VLLMConfig
movchan74 committed Feb 20, 2024
1 parent f68c19c commit ea52411
Showing 1 changed file with 3 additions and 0 deletions.
aana/deployments/vllm_deployment.py
```diff
@@ -34,6 +34,9 @@ class VLLMConfig(BaseModel):
         gpu_memory_reserved (float): the GPU memory reserved for the model in mb
         default_sampling_params (SamplingParams): the default sampling parameters.
         max_model_len (int): the maximum generated text length in tokens (optional, default: None)
+        chat_template (str): the name of the chat template, if not provided, the chat template from the model will be used
+            but some models may not have a chat template (optional, default: None)
+        enforce_eager (bool): whether to enforce eager execution (optional, default: False)
     """

     model: str
```
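
For reference, a minimal usage sketch of the two new options. The field names come from the docstring above; the import path, model name, and chat template value are illustrative assumptions, not taken from the commit:

```python
# Hedged sketch: constructing a VLLMConfig with the two options added in this
# commit. Only the field names are confirmed by the diff; the values below are
# assumptions for illustration.
from aana.deployments.vllm_deployment import VLLMConfig

config = VLLMConfig(
    model="meta-llama/Llama-2-7b-chat-hf",  # required model field (example value)
    chat_template="llama2",  # assumed template name; omit to fall back to the model's own template
    enforce_eager=True,  # force eager execution per the new docstring entry (default: False)
)
```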
