Stop token during Inference #3

Open
sankethgadadinni opened this issue Jun 26, 2023 · 3 comments

Comments

@sankethgadadinni

Why are the responses cut off in the middle?

@NisaarAgharia
Owner

You need to update
generation_config.max_new_tokens = 200

to however many new tokens you want the model to generate.
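
For reference, a minimal sketch of where this setting takes effect, assuming a transformers model and tokenizer (e.g. Falcon-7B) are already loaded as model and tokenizer and that the input string is named prompt (these names are assumptions, not taken from the notebook):

# Assumed setup: model, tokenizer, and prompt are already defined.
generation_config = model.generation_config
generation_config.max_new_tokens = 200  # upper bound on newly generated tokens per call

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))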

@sankethgadadinni
Author

generation_config = model.generation_config
generation_config.max_new_tokens = 100                    # hard cap on newly generated tokens
generation_config.temperature = 0.5                       # lower temperature = less random sampling
generation_config.top_p = 0.7                             # nucleus sampling cutoff
generation_config.num_return_sequences = 1
generation_config.pad_token_id = tokenizer.eos_token_id   # reuse EOS as the pad token
generation_config.eos_token_id = tokenizer.eos_token_id   # generation ends early if EOS is emitted

I have this config, but the output still stops once the maximum number of new tokens is reached. Is there a way to stop at the end of a sentence, like OpenAI does?

@NisaarAgharia
Owner

Increase max_new_tokens to something like 400-500 to get longer replies. Falcon-7B can output at most 2k tokens.
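
If the goal is to stop at a sentence or paragraph boundary rather than at the token cap, one possible approach with transformers is a custom StoppingCriteria. This is only a sketch, not something from this repo: it assumes model, tokenizer, generation_config, and prompt are the objects set up above, and the stop string is just an example.

from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnSubstrings(StoppingCriteria):
    # Stop once the newly generated text contains any of the stop strings.
    def __init__(self, tokenizer, stop_strings, prompt_length):
        self.tokenizer = tokenizer
        self.stop_strings = stop_strings
        self.prompt_length = prompt_length  # number of prompt tokens to skip when decoding

    def __call__(self, input_ids, scores, **kwargs):
        generated = self.tokenizer.decode(
            input_ids[0, self.prompt_length:], skip_special_tokens=True
        )
        return any(s in generated for s in self.stop_strings)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
criteria = StoppingCriteriaList(
    [StopOnSubstrings(tokenizer, ["\n\n"], inputs["input_ids"].shape[1])]  # stop at a blank line (example)
)
outputs = model.generate(
    **inputs,
    generation_config=generation_config,
    stopping_criteria=criteria,
)

Note that max_new_tokens still acts as a hard ceiling, and the stop string itself stays in the output, so it may need to be trimmed after decoding.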
