I am running the containerized version of self-hosted Refact. When I try to use one of the models for chat (such as wizardlm), the chat always times out.
I have a system that has 2 Quadro P6000s:
The models load and I can see that inference is happening. However, VSCode times out before the processing has completed:
When I watch nvtop during this time, I can see that processing takes a while, but that should be acceptable for a self-hosted model.
I would expect the timeout to be configurable, or at least not so short.
The file is app.py: 134 lines (including comments) of Python code. I understand I am not running this on an A100 card, so I expect the response to be slower. How can I reasonably interact with the chatbot when using a self-hosted model?
When I uncheck the "use app.py" option and paste in only a code snippet, reducing the amount of code to 48 lines, the VSCode extension still times out.
LOGS: watchdog_20231221.log