Allow overriding the binding generator and work on a better CROSS_COMPILE.md (#1010)

make pgx-pg-sys build script use the path to the directory containing the (extracted) target info using a secret envar `PGX_TARGET_INFO_PATH_PGxx`

Also adds a new cargo-pgx subcommand: `cargo pgx cross pgx-target` which creates a tarball of generated bindings (and other things) from the host.
thomcc authored Jan 15, 2023
1 parent 04641d6 commit 8bdb725
Showing 8 changed files with 295 additions and 10 deletions.
74 changes: 69 additions & 5 deletions CROSS_COMPILE.md
@@ -1,4 +1,4 @@
# Cross compiling `pgx`
# Cross compiling `pgx` (on Linux)

*Warning: this guide is still a work in progress!*

@@ -8,20 +8,84 @@ Note that guide is fairly preliminary and does not cover many cases, most notabl

1. This does not (yet) cover cross compiling with `cargo pgx` (planned). Note that this means this documentation may only be useful to a small set of users.

2. This is assuming that you are cross compiling between `x86_64-unknown-linux-gnu` and `aarch64-unknown-linux-gnu` (either direction works). Compiling to other targets will likely be similar, but are left as an exercise for the reader.
2. Cross-compiling the `cshim` is not (yet?) supported. You should ensure that the `pgx/cshim` cargo feature is disabled when you perform the cross-build.

3. Cross-compiling the `cshim` is possible but difficult and not fully documented here. You should ensure that the `pgx/cshim` is disabled when you perform the cross-build.
3. This guide is assuming that you are cross compiling between `x86_64-unknown-linux-gnu` and `aarch64-unknown-linux-gnu` (either direction works).

- Compiling between other architectures (but still Linux on both ends) will likely be similar, but is left as an exercise for the reader.
- Compiling between other OSes is... considerably more difficult, and not officially supported at the moment (although it is possible with quite a bit of pain).
- Non-`gnu` Linux targets have not been investigated at all (this mostly applies to `-musl` targets like Alpine, as I doubt PG will work well under uclibc).

Other cases are encouraged to investigate Docker, QEMU, and so on -- it's honestly probably the easier path for a lot of cases.

An additional caveat (one that is unrelated to the document's completeness) is that the cross-compilation process is highly specific to your Linux distribution, unfortunately. This guide covers Debian (where it's easy) and Fedora (where it's hard). The Debian steps should apply to any `apt`-using distribution, and most of the Fedora steps will apply to other distributions with minimal conversion.

# System requirements

1. Postgres headers configured for the target (which we'll talk about shortly).
2. A cross compilation toolchain (which we'll also talk about shortly).
3. Everything you'd need to build things on the host (which we won't talk about, but install it).
4. A Rust toolchain for the target (which we also won't talk about).

## Getting PostgreSQL headers for the target
[getting-headers]: #getting-postgresql-headers-for-the-target

To generate the bindings for `pgx-pg-sys` we need the PostgreSQL headers. Unfortunately, a few of these headers get generated when Postgres is compiled and contain non-portable information, which means that we can't just use the headers for the version of Postgres you have installed. You need to get the server headers from a PostgreSQL build done for the cross-compile target. The version of PostgreSQL *must* match.

*Caveat: It is _critically important_ that you ensure the headers you get are for the correct version of Postgres. Failure to do so will _probably not_ be detected at build time, and will almost certainly result in an extension that fails _terribly_ at runtime. Do not mess this up!*
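Since a version mismatch is unlikely to be caught at build time, it can help to check it mechanically before using a set of headers. The following is a minimal sketch, not part of `pgx`: the function names are hypothetical, and it assumes the `pg_config --version` output has the usual `PostgreSQL <version> ...` shape and that you only care about PostgreSQL 10 or later, where the major version is the leading number. It only compares major versions, so it is a sanity check rather than a guarantee.

```rust
/// Extract the major version from `pg_config --version` output,
/// e.g. "PostgreSQL 15.1 (Debian 15.1-1.pgdg110+1)".
/// Hypothetical helper; assumes PostgreSQL 10+ version numbering.
fn parse_pg_major_version(version_output: &str) -> Option<u16> {
    // The second whitespace-separated token holds the version number.
    let version = version_output.split_whitespace().nth(1)?;
    // Keep only the leading digits so "15.1" and "16beta1" both parse.
    let digits: String = version.chars().take_while(|c| c.is_ascii_digit()).collect();
    digits.parse().ok()
}

/// Return an error message if the headers' major version is not the expected one.
fn check_headers_version(version_output: &str, expected_major: u16) -> Result<(), String> {
    match parse_pg_major_version(version_output) {
        Some(major) if major == expected_major => Ok(()),
        Some(major) => Err(format!("expected PostgreSQL {expected_major}, found {major}")),
        None => Err(format!("could not parse a version from {version_output:?}")),
    }
}
```

You would feed it the text printed by `pg_config --version` on the machine the headers came from, together with the major version you intend to build against.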

To minimize the chance for compatibility issues: The headers should be produced by a PostgreSQL build as close as possible to the one which will actually use the cross-compiled extension. That means they should be built with the same configuration flags, and ideally for the same Linux distribution (yes, really[^distro]). You probably also need to try to ensure that glibc and linux kernel versions match what you use for the cross-compile toolchain in the next step (an exact match would be ideal, but it is difficult on anything other than Debian).

[^distro]: Comments on PostgreSQL's mailing list [indicate](https://www.postgresql.org/message-id/1556.1486012839%40sss.pgh.pa.us) that they consider it fine if there are ABI differences between extensions built on (for example) Debian versus RHEL. In reality this seems unlikely as long as things like kernel and glibc versions are compatible (but don't blame me if that isn't true), but either way, it's an indication that you should go out of your way to ensure *everything* on the build machine and cross-compilation environment is *as similar as possible* to what it will be in the location where the extension will ultimately get installed.

One additional warning is that while currently `pgx` doesn't emit bindings for headers which include (transitively or otherwise) headers provided by third-party dependencies, those do exist (PostgreSQL has headers which transitively include headers from ICU, and from LLVM's C API). At the moment we don't need them, but future versions of PG could change that, so be aware. If those headers become necessary, it's very likely that the failure would be at compile time rather than runtime, so this is *mostly* something that doesn't need to be handled yet (but might be necessary when dealing with patched versions of Postgres).

### Where to get the headers
[get-headers-options]: #where-to-get-the-headers

Anyway, there are a few ways you can go about getting these headers. This is not exhaustive, although some alternatives have good reasons for being left out (for example, attempting to cross-compile PostgreSQL itself to get headers is only a good idea if you're actually going to use that cross-compiled PostgreSQL):

1. If you built the version of PostgreSQL you're planning on using for the cross-compile target yourself (or a coworker you have access to did it), then use the headers from that build. They should be

While you're at it, mark down the versions of glibc and the Linux kernel that they were built against, as that will be useful for figuring out the cross-compilation toolchain.

2. A simple approach that is unlikely to go wrong is to copy them off of an actual machine of that architecture running the same Postgres configuration you care about. This is a good option for cases like using [PL/Rust](https://github.com/tcdi/plrust)'s cross-compilation and replication support simultaneously, but it may not always be possible.

Concretely, you want to use the headers from the folder displayed when you run `pg_config --includedir-server`. Make sure to keep track of the version output by `pg_config --version` (and make sure it matches what you expect).

3. If you know that you're planning on using Postgres installed from a package repository, you can grab the headers from there. This is a good option if you're cross-compiling an extension on the same version of the same Linux distribution, and intend to use the same package repositories to install Postgres in both cases.

For example, on Debian something like `apt download postgresql-server-dev-$VERSION:arm64` (note that Debian uses `arm64` for `aarch64` targets, and `amd64` for `x86_64` targets) will produce a `postgresql-server-dev-$VERSION_..._arm64.deb` file in the current working directory, and extracting the headers from it (while out of scope of this guide) is hopefully straightforward. This requires that your distribution packages binaries for your chosen target architecture (`arm64` in the example above), which is not guaranteed (while Debian itself is quite solid about providing binaries for several architectures, this is not true for every apt-based distribution).

Similarly, the incantation on Fedora is something like `dnf download --forcearch aarch64 postgresql-server-devel`, which puts an `.rpm` in the current working directory. RPM is somewhat more involved to extract (use `rpm2cpio` and `cpio` in conjunction), but entirely possible. Be very aware of the Postgres version these headers come from, since Fedora (for example) does not seem to have packages for multiple versions of PostgreSQL.

And so on, that should give you ideas.
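The Debian naming quirk mentioned above (`arm64` for `aarch64`, `amd64` for `x86_64`) is easy to get wrong when scripting the download. As a rough sketch, assuming only the two architectures this guide covers, the mapping and the resulting `apt download` arguments could be built like this. The function names are hypothetical and not part of `cargo-pgx`.

```rust
/// Map the CPU part of a Rust target triple to the Debian architecture name.
/// Only the two architectures covered by this guide are handled.
fn debian_arch(target_triple: &str) -> Option<&'static str> {
    match target_triple.split('-').next()? {
        "aarch64" => Some("arm64"),
        "x86_64" => Some("amd64"),
        _ => None,
    }
}

/// Build the arguments for `apt` that download the server headers package
/// for the given PostgreSQL major version and target.
fn apt_download_args(pg_major: u16, target_triple: &str) -> Option<Vec<String>> {
    let arch = debian_arch(target_triple)?;
    Some(vec![
        "download".to_string(),
        format!("postgresql-server-dev-{pg_major}:{arch}"),
    ])
}
```

The returned vector can be passed to `std::process::Command::new("apt").args(...)`. Unknown architectures yield `None` rather than a guess, which matches the guide's position that other targets are unverified.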

Some dubious advice: In practice, it's possible that small differences (using a different distro's package, or using PG that was configured slightly differently) will not matter much for your extension. In some cases, it's even possible that using headers which were configured for a different architecture won't cause issues for you -- it may even be the right choice in some situations (for example, if production builds aren't cross compiled, so the mismatch only exists during development, I'd be inclined to say it's only a problem if it causes one).

That said, you should be well-aware of the risk when deviating from this, and I would try to get it right for anything you're going to use in production.

## Getting a cross-compilation toolchain
[get-toolchain]: #getting-a-cross-compilation-toolchain

First off, if you're on Debian (and probably other `apt`-using distributions) this should be pretty easy -- just `sudo apt install crossbuild-essential-arm64` if you want to cross-compile *to* aarch64 (*from* x86_64), and `sudo apt install crossbuild-essential-amd64` if you want to cross-compile *to* x86_64 (*from* aarch64). If that's you, you're done with this step, and can move on to the next section.
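If you script this step, the package choice depends only on which side you are building from. Here is a small sketch, assuming the two-architecture setup this guide describes. `crossbuild_package` is a hypothetical helper, and its input is expected to look like `std::env::consts::ARCH`.

```rust
/// Given the host CPU architecture, name the Debian package that installs
/// a toolchain targeting the *other* architecture covered by this guide.
fn crossbuild_package(host_arch: &str) -> Option<&'static str> {
    match host_arch {
        // Building on x86_64, so we cross-compile to aarch64 (Debian: arm64).
        "x86_64" => Some("crossbuild-essential-arm64"),
        // Building on aarch64, so we cross-compile to x86_64 (Debian: amd64).
        "aarch64" => Some("crossbuild-essential-amd64"),
        // Anything else is outside what this guide covers.
        _ => None,
    }
}
```

Calling `crossbuild_package(std::env::consts::ARCH)` gives the package name to hand to `sudo apt install`.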

Otherwise, you're going to have to do this the hard way, which means providing a cross-compile toolchain. Unfortunately, there are some extra wrinkles, as we need bindings generated to be as identical as possible to the ones that will be generated in non-cross settings.

In particular, this means you should try to ensure that the version of glibc and the linux headers for the cross-compilation toolchain are... reasonably close to reality.

FIXME finish rewriting the rest of this

# Distributions

Unfortunately, the cross-compilation process is quite distribution specific. We'll cover two cases:

1. Debian-based distributions, where this is very easy.
1. Debian-based distributions, where this is not that bad.
2. Distributions where userspace cross-compilation is not directly supported (such as the Fedora-family). This is much more difficult, so if you have a choice you should not go this route.

## Debian

Of the mainstream distributions (that is, excluding things like NixOS which probably do also make this easy) the easiest path available is likely to be on Debian-family systems. This is for two reasons:
Of the mainstream distributions (that is, excluding things like NixOS which apparently are designed to make this easy) the easiest path available is likely to be on Debian-family systems. This is for two reasons:

1. The cross compilation tools can be installed via an easy package like `crossbuild-essential-arm64` (when targeting `aarch64`) or `crossbuild-essential-amd64` (when targeting `x86_64`)

36 changes: 36 additions & 0 deletions Cargo.lock


5 changes: 4 additions & 1 deletion cargo-pgx/Cargo.toml
@@ -17,7 +17,7 @@ edition = "2021"
atty = "0.2.14"
cargo_metadata = "0.15.2"
cargo_toml = "0.11.8"
clap = { version = "4.0.32", features = [ "env", "suggestions", "cargo", "derive" ] }
clap = { version = "4.0.32", features = [ "env", "suggestions", "cargo", "derive", "wrap_help" ] }
clap-cargo = { version = "0.10.0", features = [ "cargo_metadata" ] }
semver = "1.0.16"
owo-colors = { version = "3.5.0", features = [ "supports-colors" ] }
@@ -46,6 +46,9 @@ color-eyre = "0.6.2"
tracing = "0.1.37"
tracing-error = "0.2.0"
tracing-subscriber = { version = "0.3.16", features = [ "env-filter" ] }
flate2 = { version = "1.0.25", default-features = false, features = ["rust_backend"] }
tar = { version = "0.4.38", default-features = false }
tempfile = "3.3.0"

[features]
default = ["ureq/native-tls"]
40 changes: 40 additions & 0 deletions cargo-pgx/src/command/cross/mod.rs
@@ -0,0 +1,40 @@
/*
Portions Copyright 2019-2021 ZomboDB, LLC.
Portions Copyright 2021-2023 Technology Concepts & Design, Inc. <[email protected]>
All rights reserved.
Use of this source code is governed by the MIT license that can be found in the LICENSE file.
*/

use crate::CommandExecute;
pub(crate) mod pgx_target;

/// Commands having to do with cross-compilation. (Experimental)
#[derive(clap::Args, Debug)]
#[clap(about, author)]
pub(crate) struct Cross {
    #[command(subcommand)]
    pub(crate) subcommand: CargoPgxCrossSubCommands,
}

impl CommandExecute for Cross {
    fn execute(self) -> eyre::Result<()> {
        self.subcommand.execute()
    }
}

/// Subcommands relevant to cross-compilation.
#[derive(clap::Subcommand, Debug)]
pub(crate) enum CargoPgxCrossSubCommands {
    PgxTarget(pgx_target::PgxTarget),
}

impl CommandExecute for CargoPgxCrossSubCommands {
    fn execute(self) -> eyre::Result<()> {
        use CargoPgxCrossSubCommands::*;
        match self {
            PgxTarget(target_info) => target_info.execute(),
        }
    }
}
118 changes: 118 additions & 0 deletions cargo-pgx/src/command/cross/pgx_target.rs
@@ -0,0 +1,118 @@
/*
Portions Copyright 2019-2021 ZomboDB, LLC.
Portions Copyright 2021-2023 Technology Concepts & Design, Inc. <[email protected]>
All rights reserved.
Use of this source code is governed by the MIT license that can be found in the LICENSE file.
*/
use crate::CommandExecute;
use eyre::{eyre, Result, WrapErr};
use pgx_pg_config::PgConfig;
use std::{
    path::{Path, PathBuf},
    process::{Command, Stdio},
};

/// Build and output a PGX target bundle into the current directory.
///
/// This file is a tarball containing information which can be used to help
/// build PG extensions compatible with this machine's PostgreSQL installation.
/// It is optional, but recommended for most cases (any case where the host and
/// target are not identical versions of Debian).
///
/// See the documentation in `CROSS_COMPILE.md` in <https://github.com/tcdi/pgx>
/// for specifics of this file format and how to use the resulting file. Note
/// that this is currently unlikely to be useful on non-Linux targets, as pgx
/// does not yet support cross-compilation on those targets.
#[derive(clap::Args, Debug)]
pub(crate) struct PgxTarget {
    /// The `pg_config` path (default is the first `pg_config` in "$PATH").
    ///
    /// Caveat: Running this against PostgreSQL installations placed in
    /// `~/.pgx/$pgver/` by `cargo pgx init` is probably a mistake in most cases.
    #[arg(long, short = 'c', value_parser)]
    pub pg_config: Option<PathBuf>,

    /// Output filename. Defaults to `pgx-target.$target_arch.tgz`
    #[arg(long, short = 'o', value_parser)]
    pub output: Option<PathBuf>,

    /// Override the `pgx-pg-sys` dependency (used to generate bindings). By
    /// default we use a version of pgx-pg-sys which has the same version
    /// as the `cargo-pgx` binary.
    #[arg(long, value_parser)]
    pub pg_sys_path: Option<PathBuf>,

    /// The PostgreSQL major version that is needed. We will error if the
    /// provided `pg_config` is not that version.
    #[arg(long, short = 'P', value_parser)]
    pub pg_version: Option<u16>,
}

impl CommandExecute for PgxTarget {
    fn execute(self) -> eyre::Result<()> {
        let temp = tempfile::tempdir()?;
        make_target_info(&self, temp.path())?;
        temp.close()?;
        Ok(())
    }
}

#[tracing::instrument(level = "error")]
fn make_target_info(cmd: &PgxTarget, tmp: &Path) -> Result<()> {
    let pg_config_path = cmd.pg_config.clone().unwrap_or_else(|| "pg_config".into()).to_owned();
    let pg_config = PgConfig::new_with_defaults(pg_config_path.clone());

    let major_version = pg_config.major_version()?;
    if let Some(expected_pg_version) = cmd.pg_version {
        eyre::ensure!(
            major_version == expected_pg_version,
            "the provided `pg_config` had the wrong major version",
        );
    }

    run(Command::new("cargo").args(["init", "--lib", "--name", "temp-crate"]).current_dir(tmp))?;

    let cargo_add: Vec<String> = if let Some(pg_sys_path) = &cmd.pg_sys_path {
        let abs = pg_sys_path.canonicalize().wrap_err_with(|| {
            format!("given `--pg-sys-path` could not be canonicalized: {pg_sys_path:?}")
        })?;
        vec!["pgx-pg-sys".into(), "--path".into(), abs.display().to_string()]
    } else {
        let own_version = env!("CARGO_PKG_VERSION");
        vec![format!("pgx-pg-sys@={own_version}")]
    };

    run(Command::new("cargo")
        .arg("add")
        .args(cargo_add)
        .arg("--no-default-features")
        .current_dir(tmp))?;

    let filename = format!("pg{major_version}_raw_bindings.rs");
    run(Command::new("cargo")
        .current_dir(tmp)
        .arg("build")
        .arg("--features")
        .arg(format!("pgx-pg-sys/pg{major_version}"))
        .env("PGX_PG_CONFIG_PATH", &pg_config_path)
        .env("PGX_PG_SYS_EXTRA_OUTPUT_PATH", &tmp.join(&filename)))?;

    run(Command::new("rustfmt").current_dir(tmp).arg(&filename))?;
    run(Command::new("tar").current_dir(tmp).arg("czf").arg("out.tgz").arg(&filename))?;
    std::fs::rename(tmp.join("out.tgz"), format!("pgx-target.{}.tgz", std::env::consts::ARCH))?;

    Ok(())
}

#[tracing::instrument(level = "info", fields(command = ?c), err)]
fn run(c: &mut Command) -> Result<()> {
    c.stdout(Stdio::inherit()).stderr(Stdio::inherit());
    // `run` is shared by every step (cargo, rustfmt, tar), so name the command that failed to start.
    let status = c.status().wrap_err_with(|| format!("unable to run command: {c:?}"))?;
    if !status.success() {
        Err(eyre!("{:?} failed with exit code: {}", c, status))
    } else {
        Ok(())
    }
}
1 change: 1 addition & 0 deletions cargo-pgx/src/command/mod.rs
@@ -8,6 +8,7 @@ Use of this source code is governed by the MIT license that can be found in the
*/

pub(crate) mod connect;
pub(crate) mod cross;
pub(crate) mod get;
pub(crate) mod init;
pub(crate) mod install;
2 changes: 2 additions & 0 deletions cargo-pgx/src/command/pgx.rs
@@ -39,6 +39,7 @@ enum CargoPgxSubCommands {
    Connect(super::connect::Connect),
    Test(super::test::Test),
    Get(super::get::Get),
    Cross(super::cross::Cross),
}

impl CommandExecute for CargoPgxSubCommands {
@@ -58,6 +59,7 @@ impl CommandExecute for CargoPgxSubCommands {
            Connect(c) => c.execute(),
            Test(c) => c.execute(),
            Get(c) => c.execute(),
            Cross(c) => c.execute(),
        }
    }
}