Why are the responses cut down in the middle?
You need to update `generation_config.max_new_tokens = 200` to however many new tokens you want it to generate.
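For example, a minimal sketch (assuming a `prompt` string and already-loaded `model` and `tokenizer` objects, and the standard transformers `generate()` API):

```python
generation_config = model.generation_config
generation_config.max_new_tokens = 500  # allow up to 500 newly generated tokens

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```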
```python
generation_config = model.generation_config
generation_config.max_new_tokens = 100
generation_config.temperature = 0.5
generation_config.top_p = 0.7
generation_config.num_return_sequences = 1
generation_config.pad_token_id = tokenizer.eos_token_id
generation_config.eos_token_id = tokenizer.eos_token_id
```
I have this config, but the output still gets cut off once the token count is reached. Is there a way to stop at the end of a sentence, like OpenAI's models do?
Increase `max_new_tokens` to something like 400-500 to get longer replies. Falcon-7B can output at most 2k tokens.
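On the stop-at-sentence-end question: one option is a custom `StoppingCriteria`, which `model.generate()` accepts via the `stopping_criteria` argument, to approximate OpenAI-style stop sequences. A rough sketch, assuming the same `model`, `tokenizer`, `inputs`, and `generation_config` as above; the sentence-ending token ids are an assumption and depend on how your tokenizer actually splits punctuation:

```python
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnSentenceEnd(StoppingCriteria):
    """Stop as soon as the last generated token is one of the given
    sentence-ending token ids -- a rough stand-in for OpenAI-style stop sequences."""
    def __init__(self, stop_token_ids):
        self.stop_token_ids = set(stop_token_ids)

    def __call__(self, input_ids, scores, **kwargs):
        return input_ids[0, -1].item() in self.stop_token_ids

# Assumption: these ids depend on your tokenizer; "." may be merged with the
# preceding word, so inspect the ids your tokenizer actually produces.
stop_ids = [tokenizer.convert_tokens_to_ids(t) for t in (".", "!", "?")]

outputs = model.generate(
    **inputs,
    generation_config=generation_config,
    stopping_criteria=StoppingCriteriaList([StopOnSentenceEnd(stop_ids)]),
)
```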