Pull requests: HabanaAI/vllm-fork


Pull requests list

Extend accuracy tests for models that we support
#824 opened Feb 13, 2025 by AnetaKaczynska
Resolve Speculative Decode RTE
#823 opened Feb 13, 2025 by tannervoas742
enable LoRA for embedding models
#821 opened Feb 12, 2025 by skaulintel
Bump transformers from 4.47.0 to 4.48.0 (label: dependencies)
#815 opened Feb 11, 2025 by dependabot bot
support inc dynamic quant deepseek
#814 opened Feb 11, 2025 by changwangss
Rebase 2025-02-10
#810 opened Feb 10, 2025 by kzawora-intel
support inc dynamic quantization
#803 opened Feb 8, 2025 by changwangss
Qwen2 vl
#802 opened Feb 7, 2025 by malkomes Draft
mszu/merged scheduler
#799 opened Feb 7, 2025 by szutenberg Draft
[WIP] Updating docs for the vLLM 1.20 release
#798 opened Feb 7, 2025 by PatrykWo
Support qwenvl model for HPU (label: New Model)
#793 opened Feb 7, 2025 by yingjie-han
Enable roberta embedding
#786 opened Feb 5, 2025 by yeonsily
Improve RMSNorm to support 2D inputs
#784 opened Feb 5, 2025 by YangQun1
Recalc scales from user
#774 opened Feb 3, 2025 by linoybu
Fix warmup padding
#759 opened Jan 30, 2025 by mfylcek Draft
Initial enablement for text-embedding
#758 opened Jan 30, 2025 by libinta
Allow tests to run in t.compile
#724 opened Jan 22, 2025 by Kacper-Pietkun
Delayed sampling
#720 opened Jan 22, 2025 by mfylcek Draft