Issues: triton-inference-server/fastertransformer_backend

Issues list

- Failed to run on H100 GPU with tensor para=8 (#166, opened Sep 15, 2023 by sfc-gh-zhwang; updated Jul 3, 2024)
- Memory usage is doubled when loading an fp16 model into bf16 [bug] (#164, opened Sep 6, 2023 by skyser2003; updated Mar 18, 2024)
- tritonserver version (#173, opened Nov 2, 2023 by double-vin; updated Nov 2, 2023)
- All flan-t5 models don't work for me [bug] (#114, opened Apr 4, 2023 by PetroMaslov; updated Sep 27, 2023)
- No response is received during inference in decoupled mode [bug] (#169, opened Sep 26, 2023 by amazingkmy; updated Sep 26, 2023)
- The docs are not updated with the source code (#167, opened Sep 22, 2023 by trinhtuanvubk; updated Sep 22, 2023)
- How to deploy multiple models on a node with multiple GPUs [bug] (#165, opened Sep 14, 2023 by jjjjohnson; updated Sep 14, 2023)
- Can I stop execution? (w/ decoupled mode) [bug] (#162, opened Aug 21, 2023 by Yeom; updated Sep 12, 2023)
- Can I enable streaming on an ensemble model? (#155, opened Jul 18, 2023 by flexwang; updated Aug 31, 2023)
- Streaming throwing queue.get() error [bug] (#44, opened Sep 13, 2022 by rtalaricw; updated Aug 16, 2023)
- Do I need to specify ARG SM=80 when building the image manually? (#161, opened Aug 15, 2023 by sfc-gh-zhwang; updated Aug 15, 2023)
- Is is_return_log_probs required for a decoupled model? (#160, opened Aug 9, 2023 by flexwang; updated Aug 9, 2023)
- Failing to build with Triton 23.04 [bug] (#150, opened Jun 30, 2023 by bronzafa; updated Jul 3, 2023)
- huggingface_bert_convert.py can't convert some keys [bug] (#152, opened Jul 3, 2023 by SeungjaeLim; updated Jul 3, 2023)
- Repo fails to build using Triton image 23.01 [bug] (#93, opened Feb 13, 2023 by Chris113113; updated Jul 2, 2023)
- Is DeBERTa supported in the FasterTransformer backend? (#148, opened Jun 28, 2023 by sfc-gh-zhwang; updated Jun 29, 2023)
- FasterTransformer backend fails to build using the latest version of Triton Server [bug] (#140, opened Jun 2, 2023 by mshuffett; updated Jun 19, 2023)
- Why is it needed to set max_batch_size to 1 under interactive mode? (#143, opened Jun 12, 2023 by zhypku; updated Jun 12, 2023)