Add Git LFS support to uv-git crate #10335

Merged 9 commits on Jan 13, 2025

Contributor

@sydduckworth sydduckworth commented Jan 6, 2025

Summary

Closes #3312.

This PR adds Git LFS support to the uv-git crate by using the git-lfs CLI to fetch required LFS objects for a revision following the call to git fetch.

The LFS fetch step is disabled by default and only enabled if the environment variable UV_GIT_LFS is set.

When enabled, the LFS fetch step is run for all repositories regardless of whether they have associated LFS objects. The step is skipped if the git-lfs CLI tool isn't installed.
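The gating described above could be sketched roughly as follows. This is an illustration, not the code merged in this PR: `lfs_enabled`, `plan_lfs_step`, `fetch_lfs`, and `LfsOutcome` are hypothetical names, treating any non-empty value of `UV_GIT_LFS` as "enabled" is an assumption, and so is the `origin` remote name.

```rust
use std::env;
use std::io;
use std::path::Path;
use std::process::Command;

/// Hypothetical helper: any non-empty value counts as "enabled".
/// (How uv actually parses `UV_GIT_LFS` is an assumption here.)
fn lfs_enabled(value: Option<&str>) -> bool {
    matches!(value, Some(v) if !v.is_empty())
}

/// Possible outcomes of the optional LFS step.
#[derive(Debug, PartialEq)]
enum LfsOutcome {
    /// `UV_GIT_LFS` is unset, so nothing is run.
    Disabled,
    /// LFS was requested but Git LFS is not installed; the step is skipped.
    Unavailable,
    /// `git lfs fetch` was (or should be) run.
    Fetched,
}

/// Pure decision logic, kept separate from process spawning.
fn plan_lfs_step(enabled: bool, available: bool) -> LfsOutcome {
    match (enabled, available) {
        (false, _) => LfsOutcome::Disabled,
        (true, false) => LfsOutcome::Unavailable,
        (true, true) => LfsOutcome::Fetched,
    }
}

/// Runs `git lfs fetch origin <revision>` in `repo` when the step is enabled
/// and Git LFS is available. A failing LFS command is reported as an error.
fn fetch_lfs(repo: &Path, revision: &str, available: bool) -> io::Result<LfsOutcome> {
    let enabled = lfs_enabled(env::var("UV_GIT_LFS").ok().as_deref());
    match plan_lfs_step(enabled, available) {
        LfsOutcome::Fetched => {
            let status = Command::new("git")
                .args(["lfs", "fetch", "origin", revision]) // remote name is an assumption
                .current_dir(repo)
                .status()?;
            if status.success() {
                Ok(LfsOutcome::Fetched)
            } else {
                Err(io::Error::new(
                    io::ErrorKind::Other,
                    format!("git lfs fetch failed with {status}"),
                ))
            }
        }
        other => Ok(other),
    }
}
```

Keeping the decision in `plan_lfs_step` means the "disabled", "not installed", and "fetch" cases can be checked without spawning any process.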

Test Plan

I verified that the minimal example in the linked issue passes, i.e. this command now succeeds:

UV_GIT_LFS=1 uv pip install git+https://github.com/grebnetiew/lfs-py.git

I also verified that non-LFS repositories still work, with or without git-lfs installed.

To Replicate

Attempt to use uv to install a Git dependency that contains LFS objects (e.g. uv pip install git+https://github.com/grebnetiew/lfs-py.git). This should fail with a smudge filter error.

Re-run the same command with the added environment variable UV_GIT_LFS=1. The install should now succeed.

Potential Changes / Improvements

With this change LFS objects in a given revision will always be downloaded if the user has Git LFS installed, which may not always be desired behavior. It might be helpful to add a field to the uv settings and/or an environment variable so that the LFS step can be disabled if needed.

Enabling/disabling via an environment variable has now been implemented.

- Added a `git lfs fetch` step after fetching the repository but before cloning it locally.
- Git LFS step is ignored if `git-lfs` is not installed.
- Errors while executing Git LFS are reported as normal.
/// We search for the Git LFS binary instead of using `git lfs` so that we can
/// distinguish Git LFS not being installed (which we can ignore) from
/// LFS errors (which should be returned).
static GIT_LFS: LazyLock<Option<PathBuf>> = LazyLock::new(|| which::which("git-lfs").ok());
Collaborator

I'm not certain this detection method is portable to all end-user systems. For example, I can have git lfs installed and working but not have a git-lfs binary on the path. I think probing git lfs directly is likely the most portable approach.

Contributor Author

Okay, I've updated the logic so now instead of searching for git-lfs on the path, it just tries to run git lfs version the first time LFS is invoked and uses the success of that command as an indicator of LFS availability.

I'd definitely be interested to hear if anyone has a cleaner way of determining whether LFS is available. I was initially going to just check whether the LFS fetch command returned 127 (command not found), but that return code isn't available in PowerShell, so that approach isn't portable across shells.
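A rough sketch of the "probe once, then cache" approach described in this comment, assuming a Rust toolchain with `std::sync::LazyLock` (as in the original snippet). The names `probe` and `GIT_LFS_AVAILABLE` are hypothetical and are not taken from the PR's final code.

```rust
use std::process::{Command, Stdio};
use std::sync::LazyLock;

/// Runs `program args...` and reports whether it exited successfully.
/// A spawn failure (for example, the program is not on the path) is
/// treated as "not available" rather than as an error.
fn probe(program: &str, args: &[&str]) -> bool {
    Command::new(program)
        .args(args)
        .stdout(Stdio::null()) // discard the version banner
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

/// Evaluated at most once, the first time it is dereferenced, so
/// `git lfs version` is only run if LFS is actually requested.
static GIT_LFS_AVAILABLE: LazyLock<bool> =
    LazyLock::new(|| probe("git", &["lfs", "version"]));
```

Checking the exit status of `git lfs version` avoids relying on shell-specific return codes such as 127, since the process is spawned directly rather than through a shell.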

Now instead of checking for a `git-lfs` binary on the path, we just try to run `git lfs version` to verify LFS is available.
@sydduckworth sydduckworth requested a review from samypr100 January 9, 2025 15:43
@samypr100
Collaborator

With this change LFS objects in a given revision will always be downloaded if the user has Git LFS installed, which may not always be desired behavior. It might be helpful to add a field to the uv settings and/or an environment variable so that the LFS step can be disabled if needed.

Agreed. Since this is technically a possible breaking change, a way to opt in would be great. I think an env var can be used here to trigger the behavior change, as it's relatively simple. I'd rather have this be opt-in than opt-out, as fetching LFS objects may not always be desired, particularly on large downloads.

Thoughts @zanieb?

sydduckworth and others added 2 commits January 12, 2025 12:18
- Added `UV_GIT_LFS` environment variable
- Moved LFS fetch step so it is now gated behind `UV_GIT_LFS` flag.
@sydduckworth
Contributor Author

@samypr100 I've updated the PR so that the LFS fetch step is off by default, and is enabled by setting the environment variable UV_GIT_LFS.

@zanieb
Member

zanieb commented Jan 12, 2025

Opt-in is a great idea — it makes us much more likely to accept the feature.

@zanieb zanieb self-requested a review January 12, 2025 18:09
@zanieb zanieb self-assigned this Jan 12, 2025
@zanieb
Member

zanieb commented Jan 12, 2025

Did you do any benchmarking of this?

@sydduckworth
Contributor Author

Testing on my local system with a repository with a small number of tracked files (git+https://github.com/astral-test/uv-public-pypackage), I couldn't produce a statistically significant difference in execution time for uv pip install with or without LFS support enabled.

That said, the added overhead should be equal to the time required to run git lfs version once (4.6 ms on my system) plus the time required to run git lfs fetch for each repository (15 ms for the test repo on my system). Execution time of git lfs fetch scales with number of tracked files but even testing with larger repositories I got about the same result.
So the predicted overhead (on my machine) with LFS enabled (assuming the command interacts with at least one Git repository) would be about 5 ms + about 15 ms per Git repository.
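The estimate above amounts to a one-line formula. A minimal sketch, using the rounded figures from this comment (about 5 ms for the one-time probe and about 15 ms per repository); the function name is hypothetical, and returning zero when no Git repository is involved is an assumption based on the probe only running the first time LFS is invoked.

```rust
/// Predicted extra wall-clock time in milliseconds with LFS enabled:
/// one `git lfs version` probe plus one `git lfs fetch` per Git repository.
fn predicted_lfs_overhead_ms(git_repos: u32, probe_ms: f64, fetch_ms_per_repo: f64) -> f64 {
    if git_repos == 0 {
        // Assumption: the probe is lazy, so no repository means no overhead.
        return 0.0;
    }
    probe_ms + fetch_ms_per_repo * f64::from(git_repos)
}
```

With the numbers measured here, `predicted_lfs_overhead_ms(1, 5.0, 15.0)` gives 20 ms and three repositories give 50 ms. These timings are from a single machine and will differ elsewhere.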

@samypr100
Collaborator

samypr100 commented Jan 12, 2025

You might be able to eliminate the 5 ms overhead by moving the env var check before the LFS check. Edit: never mind, I hadn't seen the latest code.

@zanieb zanieb enabled auto-merge (squash) January 13, 2025 18:21
@zanieb zanieb merged commit 97c1877 into astral-sh:main Jan 13, 2025
64 checks passed
tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Jan 22, 2025
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [astral-sh/uv](https://github.com/astral-sh/uv) | patch | `0.5.15` -> `0.5.22` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>astral-sh/uv (astral-sh/uv)</summary>

### [`v0.5.22`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0522)

[Compare Source](astral-sh/uv@0.5.21...0.5.22)

##### Enhancements

-   Include version and contact information in GitHub User Agent ([#10785](astral-sh/uv#10785))

##### Performance

-   Add fast-path for recursive extras in dynamic validation ([#10823](astral-sh/uv#10823))
-   Fetch `pyproject.toml` from GitHub API ([#10765](astral-sh/uv#10765))
-   Remove allocation in Git SHA truncation ([#10801](astral-sh/uv#10801))
-   Skip GitHub fast path when full commit is already known ([#10800](astral-sh/uv#10800))

##### Bug fixes

-   Add fallback to build backend when `Requires-Dist` mismatches ([#10797](astral-sh/uv#10797))
-   Avoid deserialization error for paths above the root ([#10789](astral-sh/uv#10789))
-   Avoid respecting preferences from other indexes ([#10782](astral-sh/uv#10782))
-   Disable the distutils setuptools shim during interpreter query ([#10819](astral-sh/uv#10819))
-   Omit variant when detecting compatible Python installs ([#10722](astral-sh/uv#10722))
-   Remove TOCTOU errors in Git clone ([#10758](astral-sh/uv#10758))
-   Validate metadata under GitHub fast path ([#10796](astral-sh/uv#10796))
-   Include conflict markers in fork markers ([#10818](astral-sh/uv#10818))

##### Error messages

-   Add tag incompatibility hints to sync failures ([#10739](astral-sh/uv#10739))
-   Improve log when distutils is missing ([#10713](astral-sh/uv#10713))
-   Show non-critical Python discovery errors if no other interpreter is found ([#10716](astral-sh/uv#10716))
-   Use colors for lock errors ([#10736](astral-sh/uv#10736))

##### Documentation

-   Add testing instructions to the AWS Lambda guide ([#10805](astral-sh/uv#10805))

### [`v0.5.21`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0521)

[Compare Source](astral-sh/uv@0.5.20...0.5.21)

##### Enhancements

-   Avoid building dynamic versions when validating lockfile ([#10703](astral-sh/uv#10703))

##### Configuration

-   Add `UV_VENV_SEED` environment variable ([#10715](astral-sh/uv#10715))

##### Performance

-   Store unsupported tags in wheel filename ([#10665](astral-sh/uv#10665))

##### Bug fixes

-   Avoid attempting to patch macOS dylib for non-macOS installs ([#10721](astral-sh/uv#10721))
-   Avoid narrowing `requires-python` marker with disjunctions ([#10704](astral-sh/uv#10704))
-   Respect environment variable credentials for indexes outside root ([#10688](astral-sh/uv#10688))
-   Respect preferences for explicit index dependencies from `requirements.txt` ([#10690](astral-sh/uv#10690))
-   Sort preferences by environment, then index ([#10700](astral-sh/uv#10700))
-   Ignore permission errors when looking for user-level configuration file ([#10697](astral-sh/uv#10697))

##### Documentation

-   Add `SyntaxWarning` compatibility note to bytecode compilation docs ([#10701](astral-sh/uv#10701))
-   Add `MACOSX_DEPLOYMENT_TARGET` to the `--python-platform` documentation ([#10698](astral-sh/uv#10698))

### [`v0.5.20`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0520)

[Compare Source](astral-sh/uv@0.5.19...0.5.20)

##### Bug fixes

-   Avoid failing when deserializing unknown tags ([#10655](astral-sh/uv#10655))

### [`v0.5.19`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0519)

[Compare Source](astral-sh/uv@0.5.18...0.5.19)

##### Enhancements

-   Filter wheels from lockfile based on architecture ([#10584](astral-sh/uv#10584))
-   Omit dynamic versions from the lockfile ([#10622](astral-sh/uv#10622))
-   Add support for `pip freeze --path` ([#10488](astral-sh/uv#10488))
-   Reduce verbosity of inline-metadata message when using `uv run <script.py>` ([#10588](astral-sh/uv#10588))
-   Add opt-in Git LFS support ([#10335](astral-sh/uv#10335))
-   Recommend `--native-tls` on SSL errors ([#10605](astral-sh/uv#10605))
-   Show expected and available ABI tags in resolver errors ([#10527](astral-sh/uv#10527))
-   Show target Python version in error messages ([#10582](astral-sh/uv#10582))
-   Add `--output-format=json` support to `uv python list` ([#10596](astral-sh/uv#10596))

##### Python

The managed Python distributions have been updated, including:

-   Python 3.14 support on Windows
-   Python 3.14.0a4 support
-   64-bit RISC-V Linux support
-   Bundled `libedit` updated from 20210910-3.1 -> 20240808-3.1
-   Bundled `tcl/tk` updated from 8.6.12 -> 8.6.14 (for all Python versions on Unix, only for Python 3.14 on Windows)

See the [`python-build-standalone` release notes](https://github.com/astral-sh/python-build-standalone/releases/tag/20250115) for more details.

##### Performance

-   Avoid allocating when stripping source distribution extension ([#10625](astral-sh/uv#10625))
-   Reduce `WheelFilename` to 48 bytes ([#10583](astral-sh/uv#10583))
-   Reduce distribution size to 200 bytes ([#10601](astral-sh/uv#10601))
-   Remove `import re` from entrypoint wrapper scripts ([#10627](astral-sh/uv#10627))
-   Shrink size of platform tag enum ([#10546](astral-sh/uv#10546))
-   Use `ArcStr` in verbatim URL ([#10600](astral-sh/uv#10600))
-   Use `memchr` for wheel parsing ([#10620](astral-sh/uv#10620))

##### Bug fixes

-   Avoid reading symlinks during `uv python install` on Windows ([#10639](astral-sh/uv#10639))
-   Correct Pyston tag format ([#10580](astral-sh/uv#10580))
-   Provide `pyproject.toml` path for parse errors in `uv venv` ([#10553](astral-sh/uv#10553))
-   Don't treat `setuptools` and `wheel` as seed packages in uv sync on Python 3.12 ([#10572](astral-sh/uv#10572))
-   Fix git-tag cache-key reader in case of slashes ([#10467](astral-sh/uv#10467)) ([#10500](astral-sh/uv#10500))
-   Include build tag in rendered wheel filenames ([#10599](astral-sh/uv#10599))
-   Patch embedded install path for Python dylib on macOS during `python install` ([#10629](astral-sh/uv#10629))
-   Read cached registry distributions when `--config-settings` are present ([#10578](astral-sh/uv#10578))
-   Show resolver hints for packages with markers ([#10607](astral-sh/uv#10607))

##### Documentation

-   Add meta titles to documents in guides, excluding integration documents ([#10539](astral-sh/uv#10539))
-   Remove `build-system` from example workspace root ([#10636](astral-sh/uv#10636))

##### Preview features

-   Make build backend type annotations more generic ([#10549](astral-sh/uv#10549))

### [`v0.5.18`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0518)

[Compare Source](astral-sh/uv@0.5.17...0.5.18)

##### Bug fixes

-   Avoid forking for identical markers ([#10490](astral-sh/uv#10490))
-   Avoid panic in `uv remove` when only comments exist ([#10484](astral-sh/uv#10484))
-   Revert "improve shell compatibility of venv activate scripts ([#10397](astral-sh/uv#10397))" ([#10497](astral-sh/uv#10497))

### [`v0.5.17`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0517)

[Compare Source](astral-sh/uv@0.5.16...0.5.17)

This release includes support for generating lockfiles from scripts based on inline metadata, as defined in PEP 723.

By default, scripts remain unlocked, and must be locked explicitly with `uv lock --script /path/to/script.py`, which
will generate a lockfile adjacent to the script (e.g., `script.py.lock`). Once generated, the lockfile will be
respected (and updated, if necessary) across `uv run --script`, `uv add --script`, and `uv remove --script` invocations.

This release also includes support for `uv export --script` and `uv tree --script`. Both commands support PEP 723
scripts with and without accompanying lockfiles.

##### Enhancements

-   Add support for locking PEP 723 scripts ([#10135](astral-sh/uv#10135))
-   Respect PEP 723 script lockfiles in `uv run` ([#10136](astral-sh/uv#10136))
-   Update PEP 723 lockfile in `uv add --script` ([#10145](astral-sh/uv#10145))
-   Update PEP 723 lockfile in `uv remove --script` ([#10162](astral-sh/uv#10162))
-   Add `--script` support to `uv export` for PEP 723 scripts ([#10160](astral-sh/uv#10160))
-   Add `--script` support to `uv tree` for PEP 723 scripts ([#10159](astral-sh/uv#10159))
-   Add `ls` alias to `uv {tool, python, pip} list` ([#10240](astral-sh/uv#10240))
-   Allow reading `--with-requirements` from stdin in `uv add` and `uv run` ([#10447](astral-sh/uv#10447))
-   Warn-and-ignore for unsupported `requirements.txt` options ([#10420](astral-sh/uv#10420))

##### Preview features

-   Add remaining Python type annotations to build backend ([#10434](astral-sh/uv#10434))

##### Performance

-   Avoid allocating for names in the PEP 508 parser ([#10476](astral-sh/uv#10476))
-   Fetch concurrently for non-first-match index strategies ([#10432](astral-sh/uv#10432))
-   Remove unnecessary `.to_string()` call ([#10419](astral-sh/uv#10419))
-   Respect sentinels in package prioritization ([#10443](astral-sh/uv#10443))
-   Use `ArcStr` for marker values ([#10453](astral-sh/uv#10453))
-   Use `ArcStr` for package, extra, and group names ([#10475](astral-sh/uv#10475))
-   Use `matches!` rather than `contains` in `requirements.txt` parsing ([#10423](astral-sh/uv#10423))
-   Use faster disjointness check for markers ([#10439](astral-sh/uv#10439))
-   Pre-compute PEP 508 markers from universal markers ([#10472](astral-sh/uv#10472))

##### Bug fixes

-   Fix `UV_FIND_LINKS` delimiter to split on commas ([#10477](astral-sh/uv#10477))
-   Improve `uv tool list` output when tool environment is broken ([#10409](astral-sh/uv#10409))
-   Only track markers for compatible versions ([#10457](astral-sh/uv#10457))
-   Respect `requires-python` when installing tools ([#10401](astral-sh/uv#10401))
-   Visit proxy packages eagerly ([#10441](astral-sh/uv#10441))
-   Improve shell compatibility of `venv` activate scripts ([#10397](astral-sh/uv#10397))
-   Read publish username from URL ([#10469](astral-sh/uv#10469))

##### Documentation

-   Add Lambda layer instructions to AWS Lambda guide ([#10411](astral-sh/uv#10411))
-   Add `uv lock --script` to the docs ([#10414](astral-sh/uv#10414))
-   Use Windows-specific instructions in Jupyter guide ([#10446](astral-sh/uv#10446))

### [`v0.5.16`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0516)

[Compare Source](astral-sh/uv@0.5.15...0.5.16)

##### Enhancements

-   Accept full requirements in `uv remove` ([#10338](astral-sh/uv#10338))

##### Performance

-   Avoid over-counting versions in batch prefetcher ([#10350](astral-sh/uv#10350))
-   Deactivate tracing for version-choosing ([#10351](astral-sh/uv#10351))
-   Force a niche into `VersionSmall` ([#10385](astral-sh/uv#10385))
-   Optimize `requirements_for_extra` ([#10348](astral-sh/uv#10348))
-   Re-enable `zlib-ng` on x86 platforms ([#10365](astral-sh/uv#10365))
-   Re-enable zlib-ng on all platforms (except s390x, PowerPC, and FreeBSD) ([#10370](astral-sh/uv#10370))
-   Remove `[u64; 4]` from small version to move `Arc` to full version ([#10345](astral-sh/uv#10345))
-   Shrink `Dist` from 352 to 288 bytes ([#10389](astral-sh/uv#10389))
-   Speed up file pins by removing nested hash map ([#10346](astral-sh/uv#10346))
-   Buffer file reads in `serde_json::from_reader` ([#10341](astral-sh/uv#10341))

##### Bug fixes

-   Avoid enforcing project-level required version for `uv self` ([#10374](astral-sh/uv#10374))
-   Fix Ruff linting warnings from generated template files for extension modules ([#10371](astral-sh/uv#10371))

##### Documentation

-   Add AWS Lambda integration guide ([#10278](astral-sh/uv#10278))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---


This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
Development

Successfully merging this pull request may close these issues.

uv doesn't correctly checkout Git dependencies with Git LFS assets
3 participants