Performance on Steam Deck #645
FrostyMisa started this conversation in Show and tell
-
The terminal output can be viewed by launching koboldcpp from the command line in a shell (like bash).
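As a rough sketch of what that can look like, here is a small Python wrapper that starts koboldcpp from a terminal and prints its console output as it arrives. The binary path and model filename are hypothetical placeholders. The `--model`, `--usevulkan` and `--gpulayers` options are, as far as I know, the relevant koboldcpp flags, but check `--help` on your build before relying on them.

```python
import subprocess
from typing import List


def build_command(binary: str, model: str, gpu_layers: int,
                  use_vulkan: bool = True) -> List[str]:
    """Assemble the argument list for a koboldcpp launch.

    The binary and model paths are placeholders; the flag names are
    assumed to match your koboldcpp build (verify with --help).
    """
    cmd = [binary, "--model", model]
    if use_vulkan:
        # Vulkan backend plus the number of layers to offload to the GPU
        cmd.append("--usevulkan")
        cmd += ["--gpulayers", str(gpu_layers)]
    return cmd


def run_and_stream(cmd: List[str]) -> int:
    """Run the command and echo each console line as soon as it is printed."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        print(line, end="")
    return proc.wait()
```

To use it, call something like `run_and_stream(build_command("./koboldcpp", "model.gguf", 0))` from a terminal. The timing lines koboldcpp prints for prompt processing and generation should then show up directly in that terminal.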
-
Thanks for the tip. Here are the results.
Vulkan, 0 layers offloaded.
CPU only.
13B_Q5_K_M GGUF model:
The 7B model is fast. With all layers offloaded to the GPU, it doesn't use the CPU and the fan stays almost quiet. With Vulkan and 0 layers offloaded, it uses the CPU a little and the fan is noticeable. CPU only produces very noticeable fan noise.
-
First, I want to thank everyone who made KoboldCpp ❤️. I did some tests with the latest version with Vulkan support and... it's great! Generation is not much faster (but still great), but the prompt processing at the beginning is now lightning fast!
5 times faster with Vulkan (all layers offloaded to the GPU). With this setup it doesn't use the CPU, so it keeps the Deck cooler and the fan doesn't scream.
4 times faster with Vulkan (0 layers offloaded to the GPU). With this setup it uses the CPU, but only 1/3 as much as the previous version of KoboldCpp with OpenBLAS.
If I knew how to open the KoboldCpp console, I would post more accurate data here, but I don't know how: the process keeps running in the background even when I choose to bring KoboldCpp to the foreground.