Very Short Responses Running External Model Weights With a Llamafile #645
Unanswered · michaelgetachew-abebe asked this question in Q&A
Replies: 2 comments
-
Could you copy and paste the command you're running and the output?
-
Hi, yes, I'm just stuck. I've been devoting all my time to studying this very closely. Thanks for the help.
@echo off
call .\llamafile.exe -m model-f16.gguf -t 52 -c 2048 -b 1024 ^
  [--language-ru_RU] ^
  -ngl 9999 [--gpu AUTO] ^
  -o log.txt ^
  [--save\all\logits-main.log] [--log-test] [--log-enable] [--log-append] ^
  [--interactive-first]
pause
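For comparison, here is a minimal sketch of the same invocation stripped down to flags that are standard llama.cpp/llamafile options (the bracketed options in the script above may not be recognized; the model path is the one from the post):

```bat
@echo off
rem Minimal sketch: load the model, use 52 threads, a 2048-token context,
rem a batch size of 1024, offload all layers to the GPU, and start in
rem interactive mode.
.\llamafile.exe -m model-f16.gguf -t 52 -c 2048 -b 1024 -ngl 9999 --interactive-first
pause
```

If this runs cleanly, the extra logging and language options can be reintroduced one at a time to find which one breaks the command.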
…________________________________
From: Justine Tunney
Sent: Friday, November 29, 2024 10:57 PM
Subject: Re: [Mozilla-Ocho/llamafile] Very Short Responses Running External Model Weights With a Llamafile (Discussion #645)
Could you copy and paste the command you're running and the output?
-
Hello everyone,
I'm a bit new to llamafile. I was trying to run Llama 2 using the 5-bit quantized GGUF weights, but I'm getting very short responses. Is there a way to adjust the response length in llamafile, or otherwise make it produce better, longer responses?
Thanks in advance.
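For reference, response length in llamafile is controlled by the `-n` (`--n-predict`) flag inherited from llama.cpp, which caps the number of tokens generated; `-n -1` generates until the model emits an end-of-sequence token. A hedged sketch (the model filename is a placeholder, not from the original post):

```
# Cap the reply at 512 generated tokens; raise -n (or use -n -1) for longer output.
./llamafile -m llama-2-7b.Q5_K_M.gguf -p "Explain quantization briefly." -n 512
```

Short replies can also come from the model stopping early at an end-of-sequence token, in which case a better prompt template matters more than the token cap.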