Allow to generate several samples for each prompt #533
Conversation
I'm still learning the `outlines/generate/` side of the codebase. If there are any components you would like test cases for, that would help me learn :)
IMHO we shouldn't reshape at all. Users should expect an array of results with the same length as their input prompts. A shape-agnostic decoder should be the final step, so we aren't required to convert to a decodable array and back again in multiple places.
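One way to read "shape-agnostic decoder" as a sketch (a hypothetical helper, not the actual outlines API; the toy `decode_fn` is illustrative): flatten whatever leading shape of token-id rows arrives, decode each row once, and restore the original nesting, so no caller needs to reshape before or after decoding. NumPy stands in for torch tensors here.

```python
import numpy as np

def decode_any_shape(token_ids, decode_fn):
    """Decode an array of token-id rows with any leading shape.

    Flattens the leading dimensions, decodes each row, then restores the
    original nesting, so callers never convert to a decodable array and back.
    """
    arr = np.asarray(token_ids)
    flat = arr.reshape(-1, arr.shape[-1])          # (N, num_tokens)
    decoded = [decode_fn(row) for row in flat]
    return np.array(decoded, dtype=object).reshape(arr.shape[:-1]).tolist()

# Toy "tokenizer" for illustration: map ids back to letters.
decode_fn = lambda row: "".join(chr(ord("a") + int(i)) for i in row)

nested = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]      # shape (2, 2, 2)
print(decode_any_shape(nested, decode_fn))
# [['ab', 'cd'], ['ef', 'gh']]
```

The same call handles a flat `(batch, num_tokens)` array or a `(batch, num_samples, num_tokens)` one, which is the point of making the decoder the final, shape-agnostic step.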
4358032
to
536cc7d
Compare
4035150
to
6989e99
Compare
Closes #416
TODO

- `torch.repeat_interleave` so samples of the same sequence in a batch are contiguous (easier for beam search)

Questions
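To illustrate why interleaved repetition keeps samples of the same prompt contiguous, here is a minimal sketch using NumPy's `np.repeat`, which mirrors `torch.repeat_interleave` along an axis (the prompt token ids are made up for the example):

```python
import numpy as np

# Two prompts, each a row of token ids (illustrative values).
prompts = np.array([[1, 2],
                    [3, 4]])

num_samples = 3

# Interleaved repetition: each prompt's copies are adjacent, so all samples
# of one sequence occupy a contiguous block of rows in the batch.
interleaved = np.repeat(prompts, num_samples, axis=0)
# rows: [1,2], [1,2], [1,2], [3,4], [3,4], [3,4]

# Plain tiling would instead alternate prompts, scattering each prompt's
# samples across the batch, which complicates per-sequence beam search.
tiled = np.tile(prompts, (num_samples, 1))
# rows: [1,2], [3,4], [1,2], [3,4], [1,2], [3,4]
```

With the interleaved layout, the candidates for one sequence sit in rows `[i * num_samples, (i + 1) * num_samples)`, which a beam-search step can slice directly.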
- Keep the shape `(num_samples * batch_size, num_tokens)` and reshape in `SequenceGenerator`? Reshaping in the generator makes the latter much more complex. We could also not reshape at all, and let the user do it manually, which will simplify chained calls in the future.
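As a sketch of the trade-off (NumPy in place of torch, with made-up shapes): given the interleaved layout, recovering a per-prompt view is a single reshape the user can do themselves, while leaving the flat array untouched keeps it directly consumable by a chained call.

```python
import numpy as np

batch_size, num_samples, num_tokens = 2, 3, 4

# Flat batch as the model produces it, samples of the same prompt contiguous
# (the repeat_interleave layout above). Values are just row-distinct fillers.
flat = np.arange(batch_size * num_samples * num_tokens).reshape(
    batch_size * num_samples, num_tokens)

# If the caller wants one sub-array of samples per prompt, one reshape suffices:
per_prompt = flat.reshape(batch_size, num_samples, num_tokens)

# Leaving `flat` as-is means the next stage of a chained call can consume it
# as an ordinary (batch, num_tokens) array with no un-reshaping step.
```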