forked from quic/efficient-transformers
Commit
Fix finite lorax generation in cb mode (quic#216)
The `examples/lora_models.py` script encounters issues in cb mode. This PR addresses the following:

* Resolves the regression in finite lorax generation within cb mode in `QEfficient/generation/text_generation_inference.py` that occurred after the last refactoring.
* Adds an additional unit test in `tests/peft/lora/test_lora_model.py` to verify the compile-generate flow for finite lorax cb mode.
* [Addressed after comments] Uses auto device picking in `tests/peft/lora/test_lora_model.py`; Updates auto device picking option for `generate()` in `QEfficient/peft/lora/auto.py`

Signed-off-by: Jou-An Chen <[email protected]>
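For readers unfamiliar with the terms in the commit message, the following is a minimal toy sketch of the general idea, not the code changed by this commit. It assumes that "cb mode" refers to continuous batching and that "finite lorax" refers to serving a fixed set of LoRA adapters, with one adapter chosen per prompt. It does not call QEfficient at all; every name in it (`Slot`, `fake_decode_step`, `run_continuous_batching`, `prompt_to_adapter`, `full_batch_size`, `generation_len`) is hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Slot:
    """One occupied decode slot (hypothetical structure, not a QEfficient type)."""
    prompt_index: int
    adapter: str
    tokens: list = field(default_factory=list)


def fake_decode_step(adapter, step):
    # Stand-in for a real decode call: it only records which adapter
    # produced which step, so the per-prompt routing can be checked.
    return f"{adapter}:{step}"


def run_continuous_batching(prompts, prompt_to_adapter, known_adapters,
                            full_batch_size, generation_len):
    """Toy continuous batching loop over a fixed (finite) adapter set."""
    if len(prompts) != len(prompt_to_adapter):
        raise ValueError("need exactly one adapter name per prompt")
    if full_batch_size < 1 or generation_len < 1:
        raise ValueError("full_batch_size and generation_len must be >= 1")
    unknown = set(prompt_to_adapter) - set(known_adapters)
    if unknown:
        raise ValueError(f"adapters not in the fixed set: {sorted(unknown)}")

    pending = deque(range(len(prompts)))   # prompts not yet scheduled
    slots = [None] * full_batch_size       # active decode slots
    outputs = [None] * len(prompts)        # results in prompt order

    while pending or any(slot is not None for slot in slots):
        # Refill any free slot with the next pending prompt.
        for i in range(full_batch_size):
            if slots[i] is None and pending:
                idx = pending.popleft()
                slots[i] = Slot(idx, prompt_to_adapter[idx])
        # Advance every active slot by one token with its own adapter.
        for i, slot in enumerate(slots):
            if slot is None:
                continue
            slot.tokens.append(fake_decode_step(slot.adapter, len(slot.tokens)))
            if len(slot.tokens) >= generation_len:
                outputs[slot.prompt_index] = slot.tokens
                slots[i] = None            # free the slot for reuse
    return outputs


results = run_continuous_batching(
    prompts=["p0", "p1", "p2"],
    prompt_to_adapter=["a", "b", "a"],
    known_adapters=["a", "b"],
    full_batch_size=2,
    generation_len=3,
)
```

With three prompts and two slots, the third prompt only starts once a slot frees up, yet each entry of `results` stays aligned with its prompt and with the adapter mapped to it. A test of the compile-generate flow, such as the one this commit adds, would presumably check a similar alignment against the real runtime.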
1 parent 1517d6a commit 05275e5
Showing 3 changed files with 34 additions and 6 deletions.