- [x] (API) Fix `return_prompt` option to always return a value
- [ ] (Client/API) Return the options that were actually passed to the model (this would not include unsupported options that were dropped); see the sketch after this list.
- [ ] Implement `ban_eos_token`
- [ ] Implement `custom_token_bans` (a combined sketch for both token-ban options follows this list)
- [ ] Better (more generic) handling of API keys for the corresponding clients (e.g. the OpenAI client has a `hasKey` method); see the sketch below.
- [ ] Return token counts from local clients (is using `.generate()` the only way?)
- [ ] Implement streaming completion; see the generator sketch below.
- [ ] Add a `prefix` to the LLM client. If defined, and `.list_models` is implemented, it will be prepended to model names when listing all models (replacing the hardcoded `"openai:"`); see the sketch below.
- [ ] Work on model downloading. It worked the one time I tried it, but it was pretty basic. At a minimum, we need things like setting the name to use for the downloaded file/folder. Also, allow passing a direct link to a `.gguf` file to download it into `models_dir`; see the sketch below.
- [ ] Figure out a system for picking the number of layers to offload to the GPU for models that support it; a rough heuristic is sketched below.
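A minimal sketch of the options echo, assuming each client declares an allowlist of supported option names (`SUPPORTED_OPTIONS` and the passed/dropped split are assumptions, not the actual API):

```python
# Hypothetical per-client allowlist of generation options.
SUPPORTED_OPTIONS = {"temperature", "max_tokens", "top_p"}

def split_options(options: dict) -> tuple[dict, dict]:
    """Split user options into (passed, dropped) by what this client supports.

    Only `passed` would be sent to the model and echoed back in the response.
    """
    passed = {k: v for k, v in options.items() if k in SUPPORTED_OPTIONS}
    dropped = {k: v for k, v in options.items() if k not in SUPPORTED_OPTIONS}
    return passed, dropped
```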
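For `ban_eos_token` and `custom_token_bans`, a minimal sketch of the shared mechanism, assuming the local backend exposes per-step logits and an `eos_token_id` (both of those are assumptions here):

```python
NEG_INF = float("-inf")

def apply_token_bans(
    logits: list[float],
    custom_token_bans: list[int],
    eos_token_id: int | None = None,
    ban_eos_token: bool = False,
) -> list[float]:
    """Force banned token ids to -inf so sampling can never pick them."""
    banned = set(custom_token_bans)
    if ban_eos_token and eos_token_id is not None:
        banned.add(eos_token_id)  # banning EOS forces generation to run to max length
    for token_id in banned:
        if 0 <= token_id < len(logits):
            logits[token_id] = NEG_INF
    return logits
```

`custom_token_bans` could then be accepted as a comma-separated string of token ids and parsed before this call, which is how some frontends expose it.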
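The API-key handling could generalize to a base-class check driven by one attribute per client. Only `hasKey` comes from the existing OpenAI client; `api_key_env_var`, the snake_case `has_key` rendering, and the subclass below are assumptions:

```python
import os

class LLMClient:
    # Name of the environment variable holding this client's API key;
    # None means the client (e.g. a local backend) needs no key.
    api_key_env_var: str | None = None

    def has_key(self) -> bool:
        """Generic replacement for per-client checks like OpenAI's hasKey."""
        if self.api_key_env_var is None:
            return True
        return bool(os.environ.get(self.api_key_env_var))

class OpenAIClient(LLMClient):
    api_key_env_var = "OPENAI_API_KEY"
```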
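Streaming completion fits naturally as a generator, so callers can render tokens as they arrive. The token source below is stubbed out since the real stream depends on the backend:

```python
from typing import Iterator

def stream_complete(prompt: str) -> Iterator[str]:
    """Yield the completion chunk by chunk instead of one final string."""
    # Stub: a real client would forward chunks from its backend's token stream.
    for chunk in ("Hello", ", ", "world", "!"):
        yield chunk

# Caller renders incrementally:
for piece in stream_complete("Say hi"):
    print(piece, end="", flush=True)
print()
```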
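A sketch of the `prefix` idea: the base client declares an optional `prefix`, and the aggregated listing prepends it wherever `.list_models` is implemented, instead of hardcoding `"openai:"`. Everything other than `prefix` and `list_models` is assumed:

```python
class LLMClient:
    prefix: str | None = None  # e.g. "openai"; None means no prefix

    def list_models(self) -> list[str]:
        raise NotImplementedError

def list_all_models(clients: list[LLMClient]) -> list[str]:
    """Aggregate model names across clients, prefixing each per its client."""
    names: list[str] = []
    for client in clients:
        try:
            models = client.list_models()
        except NotImplementedError:
            continue  # this client cannot enumerate its models
        pre = f"{client.prefix}:" if client.prefix else ""
        names.extend(pre + model for model in models)
    return names
```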
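For the direct `.gguf` link case, a minimal sketch using only the standard library; the function name, the optional `name` override (covering the "set the name for the downloaded file" point), and the chunk size are all assumptions:

```python
import urllib.request
from pathlib import Path

def download_gguf(url: str, models_dir: str, name: str | None = None) -> Path:
    """Stream a .gguf file from a direct link into models_dir."""
    dest_dir = Path(models_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    filename = name or url.rsplit("/", 1)[-1]
    if not filename.endswith(".gguf"):
        raise ValueError(f"expected a .gguf link, got: {filename}")
    dest = dest_dir / filename
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as f:
        while chunk := resp.read(1 << 20):  # 1 MiB chunks
            f.write(chunk)
    return dest
```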
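One possible starting point for the GPU offload question: estimate per-layer memory from the model's file size and layer count, then offload as many whole layers as fit in a fraction of free VRAM. The 0.85 headroom and the uniform-layer-size assumption are guesses, not measurements:

```python
def pick_gpu_layers(
    model_size_bytes: int,
    n_layers: int,
    free_vram_bytes: int,
    headroom: float = 0.85,  # assumed safety margin for KV cache and overhead
) -> int:
    """Rough heuristic: offload as many whole layers as fit in headroom * free VRAM."""
    if n_layers <= 0 or model_size_bytes <= 0:
        return 0
    bytes_per_layer = model_size_bytes / n_layers
    budget = free_vram_bytes * headroom
    return min(n_layers, int(budget // bytes_per_layer))

# Example: a 7 GiB model with 32 layers and 6 GiB free VRAM -> offload 23 layers.
print(pick_gpu_layers(7 << 30, 32, 6 << 30))
```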