Replies: 1 comment
-
You should already be able to write a program that does this: run multiple instances of koboldcpp, each with a different model on its own port, collect the response from each one, then pick the best and return that to the user.
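A minimal sketch of that idea in Python, using only the standard library. It assumes each instance exposes koboldcpp's KoboldAI-compatible `/api/v1/generate` endpoint. The ports, the `max_length` value, and the names `ENDPOINTS`, `generate`, `score`, and `best_response` are made up for illustration. "Pick the best" is not defined above, so the scoring function here is only a placeholder (longest reply wins); swap in whatever judge you prefer.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical endpoints: one koboldcpp instance per model, each on its own port.
ENDPOINTS = [
    "http://localhost:5001",
    "http://localhost:5002",
    "http://localhost:5003",
]


def generate(base_url, prompt, max_length=200, timeout=120):
    """Send one prompt to one instance and return the generated text."""
    payload = json.dumps({"prompt": prompt, "max_length": max_length}).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/api/v1/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["results"][0]["text"]


def score(text):
    """Placeholder heuristic (an assumption): prefer the longest non-empty reply."""
    return len(text.strip())


def best_response(prompt, endpoints=ENDPOINTS, fetch=generate, scorer=score):
    """Query every endpoint in parallel and return the highest-scoring reply."""
    with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
        replies = list(pool.map(lambda url: fetch(url, prompt), endpoints))
    return max(replies, key=scorer)
```

With the instances running, `best_response("your prompt here")` returns the winning text. The `fetch` and `scorer` parameters are there so you can replace the HTTP call or the selection rule without touching the rest.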
-
Is there a chance that MoA will become a built-in feature of koboldcpp in the future, or does it have to land in llama.cpp first? Running three instances of an 8B model seems better than running a single 34B model.