forked from mlc-ai/mlc-llm
Enable Logprobs in MLC Batch Serving #82
Merged: masahi merged 42 commits into octoml:batch-serving from zxybazh:feature/2023-11-22/enable-mlc-server-logprobs on Jan 31, 2024 (+376 −53)
ab47b41
Squashed commit for logprobs implementation.
zxybazh 86f6fa1
fix None check
9a29650
Change detokenization to using token ids.
zxybazh 012388d
Fix wrong usage of token ids. Remove logging.
zxybazh db31164
extend benchmarks for logprobs
be81755
fix test without logprobs
e8ec3fc
clean code
49187f5
black format engine_common.py
013ed5a
logprobs is strictly bool, top_logprobs is int
79ec413
refactor logprob info collection to not reduce performance
fca1a6f
quick fix for check
675b631
review fix
18f80fa
fix list index out of range
29ea525
rollback after rebase
aa99322
test
8fa785e
Merge pull request #7 from Deelvin/vc/benchmark
d57b197
Squashed commit for logprobs implementation.
zxybazh 7995c84
fix None check
ae3fc5b
Change detokenization to using token ids.
zxybazh 0cb036f
Fix wrong usage of token ids. Remove logging.
zxybazh ed51e7d
extend benchmarks for logprobs
ff17ae2
fix test without logprobs
f5e4339
clean code
a3f6e8b
black format engine_common.py
c54a410
logprobs is strictly bool, top_logprobs is int
379d991
refactor logprob info collection to not reduce performance
58bac8f
quick fix for check
7de8d88
review fix
661fa18
fix list index out of range
6662a65
rollback after rebase
970d7f8
test
c58d69c
small fix
ebae200
rename for the sake of clarity
b2863d5
some fixes with cpu-gpu tensor copying
57b3a35
refactor logprob pass to calculate
4e29403
remove excess deps for token detokenization
a9157b9
small clean
39efb61
small clean
601e68d
return None instead of list of Nones
4f9241b
resolve conflicts
7ec21a7
fix mypy
7aa60ed
Merge pull request #8 from Deelvin/vc/perf
@vvchernov Please make sure to remove this `decode`, since it is incorrect (it shouldn't be applied to a single token) and the same token has already been detokenized by `detokenize_incrementally` before this function is called.
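
For readers wondering why decoding a single token in isolation can be incorrect, here is a minimal sketch of one common failure mode. It assumes a byte-level vocabulary in which a multi-byte UTF-8 character is split across two tokens. The vocabulary and the helpers `decode`, `decode_per_token`, and `decode_incrementally` are hypothetical illustrations, not the functions from this PR, and the comment above does not say whether this is the exact problem the reviewer had in mind.

```python
import codecs

# Hypothetical toy vocabulary: token id -> raw bytes (byte-level BPE style).
# "é" is b"\xc3\xa9" in UTF-8 and is deliberately split across tokens 1 and 2.
VOCAB = {0: b"caf", 1: b"\xc3", 2: b"\xa9", 3: b"!"}


def decode(token_ids):
    """Decode a whole sequence of token ids to text in one call."""
    raw = b"".join(VOCAB[t] for t in token_ids)
    return raw.decode("utf-8", errors="replace")


def decode_per_token(token_ids):
    """Problematic pattern: decode every token on its own, then concatenate."""
    return "".join(decode([t]) for t in token_ids)


def decode_incrementally(token_ids):
    """Feed tokens through a stateful decoder that buffers incomplete bytes."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    pieces = [decoder.decode(VOCAB[t]) for t in token_ids]
    # Flush whatever is still buffered at the end of the stream.
    pieces.append(decoder.decode(b"", final=True))
    return pieces


ids = [0, 1, 2, 3]
per_token = decode_per_token(ids)                # contains U+FFFD replacement chars
incremental = "".join(decode_incrementally(ids))  # matches decode(ids)
```

In this toy setup the per-token path turns each half of "é" into a replacement character, while the incremental path emits an empty string for the first half and the full character once the second half arrives. Under the same assumption, calling a per-token decode again on a token that an incremental detokenizer has already handled is both redundant and potentially lossy.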