Merge pull request #75 from mobiusml/hf_transfer
Add hf-transfer for faster HF Hub ops
movchan74 authored Mar 15, 2024
2 parents 11a1443 + a1db2c7 commit d918ba9
Showing 5 changed files with 81 additions and 6 deletions.
3 changes: 2 additions & 1 deletion .env
@@ -1,3 +1,4 @@
CUDA_VISIBLE_DEVICES=""
USE_DEPLOYMENT_CACHE = True
-SAVE_DEPLOYMENT_CACHE = True
+SAVE_DEPLOYMENT_CACHE = True
+HF_HUB_ENABLE_HF_TRANSFER = 1
5 changes: 3 additions & 2 deletions README.md
@@ -30,7 +30,7 @@ sh install.sh
5. Run the SDK.

```bash
-CUDA_VISIBLE_DEVICES=0 poetry run aana --port 8000 --host 0.0.0.0 --target chat_with_video
+HF_HUB_ENABLE_HF_TRANSFER=1 CUDA_VISIBLE_DEVICES=0 poetry run aana --port 8000 --host 0.0.0.0 --target chat_with_video
```

The target parameter specifies the set of endpoints to deploy.
@@ -211,4 +211,5 @@ Here are the environment variables that can be used to configure the Aana SDK:
- NUM_WORKERS: The number of request workers. Default: `2`.
- DB_CONFIG: The database configuration in the format `{"datastore_type": "sqlite", "datastore_config": {"path": "/path/to/sqlite.db"}}`. Currently only SQLite and PostgreSQL are supported. Default: `{"datastore_type": "sqlite", "datastore_config": {"path": "/var/lib/aana_data"}}`.
- USE_DEPLOYMENT_CACHE (testing only): If set to `true`, the tests will use the deployment cache to avoid downloading the models and running the deployments. Default: `false`.
-- SAVE_DEPLOYMENT_CACHE (testing only): If set to `true`, the tests will save the deployment cache after running the deployments. Default: `false`.
+- SAVE_DEPLOYMENT_CACHE (testing only): If set to `true`, the tests will save the deployment cache after running the deployments. Default: `false`.
+- HF_HUB_ENABLE_HF_TRANSFER: If set to `1`, the HuggingFace Transformers will use the HF Transfer library to download the models from HuggingFace Hub to speed up the process. Recommended to always set it to `1`. Default: `0`.
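Review note: the flag only helps when the `hf-transfer` package is actually installed. As far as I know, `huggingface_hub` raises an error at download time if `HF_HUB_ENABLE_HF_TRANSFER=1` is set but the `hf_transfer` module cannot be imported. The sketch below is one way to guard against that outside a Poetry-managed environment. The helper name `hf_transfer_flag` is hypothetical and not part of this commit.

```bash
#!/bin/bash
# Hypothetical helper (not part of this commit): print 1 if the given Python
# interpreter can import hf_transfer, otherwise print 0.
hf_transfer_flag() {
    local python_bin="${1:-python3}"
    if "$python_bin" -c "import hf_transfer" >/dev/null 2>&1; then
        echo 1
    else
        echo 0
    fi
}

# Export the result so child processes (for example `poetry run aana ...`)
# inherit it.
export HF_HUB_ENABLE_HF_TRANSFER="$(hf_transfer_flag)"
```

Inside the project's own environment this check is redundant, since `pyproject.toml` now declares `hf-transfer` as a dependency. It is mainly useful when the SDK is run from a different interpreter.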
76 changes: 74 additions & 2 deletions poetry.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions pyproject.toml
@@ -19,6 +19,7 @@ deepdiff = "^6.7.0"
diffusers = "^0.23.1"
fastapi = "^0.104.0"
faster-whisper = ">=0.10.0"
+hf-transfer = "^0.1.6"
onnxruntime = "1.16.1"
opencv-python = "^4.8.1.78"
portpicker = "^1.6.0"
2 changes: 1 addition & 1 deletion startup.sh
@@ -1,3 +1,3 @@
#!/bin/bash
# TODO: pass arguments to the docker to set target instead of environment variable
-poetry run aana --port 8000 --host 0.0.0.0 --target $TARGET
+HF_HUB_ENABLE_HF_TRANSFER=1 poetry run aana --port 8000 --host 0.0.0.0 --target $TARGET
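Review note: because the flag is set inline on the command, any value passed into the container environment is overridden. This means `HF_HUB_ENABLE_HF_TRANSFER=0` cannot be used to turn the feature off. A minimal sketch of an alternative, assuming the intent is "on by default, but overridable", is shown below. The helper name `resolve_hf_transfer` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper (not part of this commit): use the caller's value if it
# is set and non-empty, otherwise fall back to 1.
resolve_hf_transfer() {
    echo "${HF_HUB_ENABLE_HF_TRANSFER:-1}"
}

export HF_HUB_ENABLE_HF_TRANSFER="$(resolve_hf_transfer)"

# startup.sh would then run the same command as before, without the inline prefix:
# poetry run aana --port 8000 --host 0.0.0.0 --target "$TARGET"
```

Quoting `"$TARGET"` in the final command is an additional suggestion. It avoids word splitting if the variable is empty or contains spaces.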
