
Update PyTorch to 2.4.1 #268

Closed
7 changes: 5 additions & 2 deletions pyproject.toml
@@ -44,8 +44,11 @@ package-dir = { "" = "src" }

[tool.setuptools.dynamic]
dependencies = { file = ["requirements.txt"] }
-optional-dependencies.cuda = { file = ["requirements-cuda.txt"] }
-optional-dependencies.rocm = { file = ["requirements-rocm.txt"] }
+optional-dependencies.cpu = { file = ["requirements/cpu.txt"] }
+optional-dependencies.cuda = { file = ["requirements/cuda.txt"] }
+optional-dependencies.hpu = { file = ["requirements/hpu.txt"] }
+optional-dependencies.mps = { file = ["requirements/mps.txt"] }
+optional-dependencies.rocm = { file = ["requirements/rocm.txt"] }

[tool.setuptools.packages.find]
where = ["src"]
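For context: setuptools turns each optional-dependencies.<name> entry above into a pip extra, so after this change users would install accelerator-specific dependencies with, for example, pip install instructlab-training[rocm]. A minimal sketch of checking which extras a built distribution advertises, assuming the distribution name instructlab-training:

# Minimal sketch (distribution name assumed): list the extras that
# setuptools derives from the dynamic optional-dependencies table.
from importlib.metadata import metadata

meta = metadata("instructlab-training")
# After this PR, the extras should read: cpu, cuda, hpu, mps, rocm
print(meta.get_all("Provides-Extra"))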
1 change: 0 additions & 1 deletion requirements-rocm.txt

This file was deleted.

2 changes: 1 addition & 1 deletion requirements.txt
@@ -4,7 +4,7 @@ pyyaml
py-cpuinfo
# we set this to be above 0a0 so that it doesn't
# replace custom pytorch images with the 2.3.0
-torch>=2.3.0a0
+torch>=2.3.0,<2.5.0
Contributor:
Should this min version here be bumped to 2.4.1?

Contributor (Author):
When Gaudi 1.18 is supported, yes.

Contributor:
Gaudi software 1.18.0 ships PyTorch 2.4.0a0. The 1.18.x series will not get a newer version.

Why are you changing the version range for Torch? I don't see any changes to Python code or other dependencies. There is no apparent reason why instructlab-training should no longer work with Torch 2.3.x or 2.5.x.

I would prefer to keep the version ranges for the dependencies of instructlab-training as open as possible and restrict upper versions only in the instructlab package.

transformers>=4.45.2
accelerate>=0.34.2
datasets>=2.15.0
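To illustrate the pre-release point from the review thread above: under PEP 440 matching rules, a specifier set admits pre-release versions only when it explicitly names one (or the installer opts in with --pre), which is why the old >=2.3.0a0 bound accepted custom builds such as Gaudi's 2.4.0a0 while the new >=2.3.0,<2.5.0 range rejects them. A small sketch using the packaging library:

# Sketch: how PEP 440 treats the old and new torch specifiers.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

old = SpecifierSet(">=2.3.0a0")       # pre-release bound: pre-releases admitted
new = SpecifierSet(">=2.3.0,<2.5.0")  # plain bounds: pre-releases excluded

gaudi = Version("2.4.0a0")            # e.g. the Gaudi 1.18.0 PyTorch build
print(gaudi in old)   # True
print(gaudi in new)   # False, unless prereleases=True / pip install --pre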
1 change: 1 addition & 0 deletions requirements/cpu.txt
@@ -0,0 +1 @@
+# Extra dependencies for CPU-only
1 change: 1 addition & 0 deletions requirements-cuda.txt → requirements/cuda.txt
@@ -1,2 +1,3 @@
+# Extra dependencies for NVIDIA CUDA
flash-attn>=2.4.0
bitsandbytes>=0.43.1
11 changes: 11 additions & 0 deletions requirements/hpu.txt
@@ -0,0 +1,11 @@
+# Extra dependencies for Intel Gaudi / Habana Labs HPU devices
Contributor:
I think Christian's PR merged and changed these things as well; I don't want to overwrite whatever he updated. Would it be possible to check which is the correct source of truth?

Contributor:
Let's keep requirements/hpu.txt empty. There is no need to restrict torch here or to pull any of the habana extensions. instructlab.training does not import them.

+# Habana Labs 1.17.1 has PyTorch 2.3.1a0+gitxxx pre-release
+torch>=2.3.1a0,<2.4.0
+# Habana Labs frameworks
+habana-torch-plugin>=1.17.1
+habana_gpu_migration>=1.17.1
+# additional Habana Labs packages (installed, but not used)
+#habana-media-loader
+#habana-pyhlml
+#habana_quantization_toolkit
+#habana-torch-dataloader
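One way to back the claim above that instructlab.training does not import the Habana extensions is to scan the package sources for such imports. A rough sketch, with the source path (derived from package-dir in pyproject.toml) and the Habana top-level module names assumed rather than confirmed against this repository:

# Rough sketch (module names assumed): flag any import of Habana
# packages inside the instructlab.training sources.
import ast
import pathlib

pkg_root = pathlib.Path("src/instructlab/training")    # per package-dir above
banned = {"habana_frameworks", "habana_gpu_migration"}  # assumed import names

for py in pkg_root.rglob("*.py"):
    for node in ast.walk(ast.parse(py.read_text())):
        if isinstance(node, ast.Import):
            mods = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods = [node.module]
        else:
            continue
        for mod in mods:
            if mod.split(".")[0] in banned:
                print(f"{py}: imports {mod}")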
1 change: 1 addition & 0 deletions requirements/mps.txt
@@ -0,0 +1 @@
+# Extra dependencies for Apple MPS (Metal Performance Shaders)
2 changes: 2 additions & 0 deletions requirements/rocm.txt
@@ -0,0 +1,2 @@
+# Extra dependencies for AMD ROCm
+flash-attn>=2.6.2,<2.7.0
Contributor:
What's the reason to limit the upper bound of flash-attn?