-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not hash absolute sysroot path into stdlib crates metadata. #14951
Conversation
An alternative approach I tried is making |
8f07586
to
f7665c0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a definitive answer here. Overall the more reproducibility the better. OTOH, it is also a niche use case like this one, and relies on how the unstable -Zbuild-std
work underneath.
// to pull crates from anywhere without worrying about conflicts. | ||
unit.pkg | ||
.package_id() | ||
.stable_hash(bcx.ws.root()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another approach to this: if a unit is from -Zbuild-std
, pass its workspace root path to stable_hash
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's cleaner yes! I've changed the code to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry but I got paranoid that this could be applied to the same logic, for consistency.
cargo/src/cargo/core/compiler/build_runner/compilation_files.rs
Lines 203 to 206 in 99dff6d
pub fn target_short_hash(&self, unit: &Unit) -> String { | |
let hashable = unit.pkg.package_id().stable_hash(self.ws.root()); | |
util::short_hash(&(METADATA_VERSION, hashable)) | |
} |
(It doesn't really matter though. The hash is for target directory layout for fingerprint and build-script output…)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind. It needs more tweaks and seems meaningless.
stdlib crates get a SourceId which has an absolute path pointing into the sysroot. This makes the metadata hash change depending on where you've installed Rust. This is causing problems because the different hashes snowball into the optimizer making different decisions which ends up changing the binary size. (Some context: at work we're working with embedded devices with little flash storage so it often happens that a binary builds locally and then fails to fit in flash in CI, just because CI has installed rustc to a different path. Improving binary size is *not* a goal of this PR, after the fix the size will be whatever, but at least it won't change based on the rustc path anymore) Overview of the fix: - For libstd crates, the metadata hash now contains the path relative to the sysroot, instead of the absolute path. - The absolute path is still hashed into the fingerprint (not the metadata) so moving the rustc installation triggers a rebuild. This ensures stdlib crates are rebuilt when upgrading nightly versions. - The rustc version is still hashed into the metadata as usual, so upgrading Rust releases (not nightly versions) does cause a metadata change. Repro of the bug: ``` $ git clone https://github.com/embassy-rs/embassy --branch cargo-nondet-repro $ cd embassy/ $ cd examples/nrf52840 $ RUSTUP_HOME=~/.rustup1 cargo build --release --bin wifi_esp_hosted .... Finished `release` profile [optimized + debuginfo] target(s) in 13.33s $ llvm-size target/thumbv7em-none-eabi/release/wifi_esp_hosted text data bss dec hex filename 114500 80 48116 162696 27b88 target/thumbv7em-none-eabi/release/wifi_esp_hosted $ RUSTUP_HOME=~/.rustup2 cargo build --release --bin wifi_esp_hosted .... Finished `release` profile [optimized + debuginfo] target(s) in 9.64s $ llvm-size target/humbv7em-none-eabi/release/wifi_esp_hosted text data bss dec hex filename 114272 80 48116 162468 27aa4 target/thumbv7em-none-eabi/release/wifi_esp_hosted ```
f7665c0
to
a2e63e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dirbaio
I am about to merge this. Could you update the PR description a bit to reflect the latest change? Thank you!
// SourceId for stdlib crates is an absolute path inside the sysroot. | ||
// Pass the sysroot as workspace root so that we hash a relative path. | ||
// This avoids the metadata hash changing depending on where the user installed rustc. | ||
&bcx.target_data.get_info(unit.kind).unwrap().sysroot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note for future self: sysroot is something like
/home/user/.rustup/toolchains/nightly-aarch64-apple-darwin
It is not the real workspace root path of the standard library.
While this seems a bit out of sync, the relative path inside sysroot should be fairly deterministic. It's fine now and I don't think we need any hardcoded path logic like this.
done! |
Update cargo 10 commits in 99dff6d77db779716dda9ca3b29c26addd02c1be..652623b779c88fe44afede28bf7f1c9c07812511 2024-12-18 00:55:17 +0000 to 2024-12-20 15:44:42 +0000 - fix(package): use relpath to cwd for vcs dirtiness report (rust-lang/cargo#14970) - Enable triagebot merge conflict notifications (rust-lang/cargo#14972) - fixed the error message for a user to open the crate (rust-lang/cargo#14969) - fix(package): show dirty filepaths relative to git workdir (rust-lang/cargo#14968) - Add the `test` cfg as a well known cfg before of compiler change (rust-lang/cargo#14963) - refactor(cargo-package): let-else to flatten code (rust-lang/cargo#14959) - fix(cargo-package): add more traces (rust-lang/cargo#14960) - Do not hash absolute sysroot path into stdlib crates metadata. (rust-lang/cargo#14951) - docs: add missing argument to Rustup Cargo workaround (rust-lang/cargo#14954) - fix(cargo-rustc): stabilize higher precedence trailing flags (rust-lang/cargo#14900)
stdlib crates get a SourceId which has an absolute path pointing into the sysroot. This makes the metadata hash change depending on where you've installed Rust. This is causing problems because the different hashes snowball into the optimizer making different decisions which ends up changing the binary size.
(Some context: at work we're working with embedded devices with little flash storage so it often happens that a binary builds locally and then fails to fit in flash in CI, just because CI has installed rustc to a different path. Improving binary size is not a goal of this PR, after the fix the size will be whatever, but at least it won't change based on the rustc path anymore)
Overview of the fix:
Repro of the bug: