-
-
Notifications
You must be signed in to change notification settings - Fork 646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement layout="zip" for Lambda/GCF, skipping lambdex #19022
Conversation
|
||
|
||
@dataclass(frozen=True) | ||
class PexVenv: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this name will be confusing with VenvPex
, but I lost inspiration, let me know if you have a better name.
pex_request = PexFromTargetsRequest( | ||
addresses=[request.address], | ||
internal_only=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels unnecessarily heavyweight, and having to materialise the full non-internal PEX just to throw it away is presumably a major reason why this new approach is slower (since I imagine there's duplicated work between "build a PEX" and "turn that PEX into a venv"), related to the discussion in the PR.
However, this needs to be built for a potentially-foreign platform, and hence, AIUI, internal_only=True
and VenvPex
don't work.
2c191e3
to
8a19f67
Compare
Hooray! After this I think we should deprecate lambdex and get rid of it. Will review asap. |
8a19f67
to
8087903
Compare
# The AWS-facing handler function is always lambdex_handler.handler, which is the | ||
# wrapper injected by lambdex that manages invocation of the actual handler. | ||
handler_log_message="lambdex_handler.handler", | ||
# this isn't built or run via lambdex, but the name is arbitrary, so we stick with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good opportunity to get rid of this name, since we're churning anyway. It'll be harder to do later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But also, are we sure we want this at all? Why not use the handler name the user configured directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But also, are we sure we want this at all? Why not use the handler name the user configured directly?
Yes, I think so. The BUILD
file target must contain the entry point (at least the file/module part of entry_point="path/to.py:handler"
), in order to compute dependencies/build the package, and it's annoying to have to carefully reproduce that elsewhere (handler: 'parent.modules.path.to.handler'
in templates/IaC code). Synthesising a fixed module avoids that, and is all info pants already knows.
For instance, in our repo, we use AWS CDK, and have a generic PythonLambda
construct where the basic "get the code running" config is just pointing it to the package (e.g. foo.bar/target.zip
), because it can just set handler: 'lambdex_handler.handler'
globally, not requiring customisation per lambda. I think this is really nice!
(There's obviously often per-lambda config required like any extra env vars or permissions, but CDK/the infra configuration is the source of truth there, it's not just directly duplicated from the BUILD
files.)
I think this is a good opportunity to get rid of this name, since we're churning anyway. It'll be harder to do later.
Sure, I'm happy to choose another name, e.g. some options:
lambda_function
: https://docs.aws.amazon.com/lambda/latest/dg/python-handler.html suggests AWS configureslambda_function.lambda_handler
by default, when creating a Python function using the consoleindex
: using "inline code" in AWS's CDK usesindex
as the file name (https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_lambda.Function.html#handler), but... thehandler
property always needs to be setpython_awslambda
,pants_python_awslambda
or__pants_lambda
or some other pants-y name- something else?
I think we can also make this customisable in future (including the ability to turn it off), e.g. synthetic_handler_module="my_custom_name"
or synthetic_handler_module=None
, if required/desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone with lambda_function
, but more than happy to change. (Thought of another option too: main
to be consistent with GCF.)
…)" This reverts commit 2bc078b.
) Reverts #19022. Per #19032 (comment), we're going to change the approach for moving away from Lambdex to a global setting, rather than the per-target `layout` field added in #19022. That change will simplify things, and involve an extra moving part that will be cherry-picked to 2.17, so I think it's better to start from scratch, and then more carefully cherry-pick the parts of #19022 that are relevant.
This fixes #18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format is just: put everything at the top-level. For instance, the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX initialisation/venv creation. This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can: - reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up. - reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed). - increase build times. For instance, for one Python 3.9 Lambda in our codebase: | metric | before | after | |---|---|---| | init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) | | compressed size | 24.6MB | 23.8MB (-0.8MB) | | uncompressed size | 117.8MB | 115.8MB (-2.0MB) | | PEX-construction build time | ~5s | ~5s | | PEX-postprocessing build time | 0.14s | 4.8s | (The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.) --- This functionality is driven by adding a new option to the `[lambdex].layout` option added in #19074. In #19074 (targeted for 2.17), it defaults `lambdex` (retaining the current code paths). This PR flips the default to the new option `zip`, which keys into the functionality above. I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start. This PR's commits can be reviewed independently. I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up: - add a warning about `files` being loaded into these packages, which has been temporarily lost (#19027) - adjust documentation #19067 - other improvements like #18195 and #18880 - improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first This is a re-doing of #19022 with a simpler approach to deprecation, as discussed in #19074 (comment) and #19032 (comment). The phasing will be: | release | supports lambdex? | supports zip? | default layout | deprecation warnings | |---|---|---|---|---| | 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to | | 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` | | 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
This fixes #18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format is just: put everything at the top-level. For instance, the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX initialisation/venv creation. This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can: - reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up. - reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed). - increase build times. For instance, for one Python 3.9 Lambda in our codebase: | metric | before | after | |---|---|---| | init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) | | compressed size | 24.6MB | 23.8MB (-0.8MB) | | uncompressed size | 117.8MB | 115.8MB (-2.0MB) | | PEX-construction build time | ~5s | ~5s | | PEX-postprocessing build time | 0.14s | 4.8s | (The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.) --- This functionality is driven by adding a new option to the `[lambdex].layout` option added in #19074. In #19074 (targeted for 2.17), it defaults `lambdex` (retaining the current code paths). This PR flips the default to the new option `zip`, which keys into the functionality above. I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start. This PR's commits can be reviewed independently. I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up: - add a warning about `files` being loaded into these packages, which has been temporarily lost (#19027) - adjust documentation #19067 - other improvements like #18195 and #18880 - improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first This is a re-doing of #19022 with a simpler approach to deprecation, as discussed in #19074 (comment) and #19032 (comment). The phasing will be: | release | supports lambdex? | supports zip? | default layout | deprecation warnings | |---|---|---|---|---| | 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to | | 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` | | 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
…d#19076) This fixes pantsbuild#18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format is just: put everything at the top-level. For instance, the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX initialisation/venv creation. This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can: - reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up. - reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed). - increase build times. For instance, for one Python 3.9 Lambda in our codebase: | metric | before | after | |---|---|---| | init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) | | compressed size | 24.6MB | 23.8MB (-0.8MB) | | uncompressed size | 117.8MB | 115.8MB (-2.0MB) | | PEX-construction build time | ~5s | ~5s | | PEX-postprocessing build time | 0.14s | 4.8s | (The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.) --- This functionality is driven by adding a new option to the `[lambdex].layout` option added in pantsbuild#19074. In pantsbuild#19074 (targeted for 2.17), it defaults `lambdex` (retaining the current code paths). This PR flips the default to the new option `zip`, which keys into the functionality above. I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start. This PR's commits can be reviewed independently. I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up: - add a warning about `files` being loaded into these packages, which has been temporarily lost (pantsbuild#19027) - adjust documentation pantsbuild#19067 - other improvements like pantsbuild#18195 and pantsbuild#18880 - improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first This is a re-doing of pantsbuild#19022 with a simpler approach to deprecation, as discussed in pantsbuild#19074 (comment) and pantsbuild#19032 (comment). The phasing will be: | release | supports lambdex? | supports zip? | default layout | deprecation warnings | |---|---|---|---|---| | 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to | | 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` | | 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
…ck of #19076) (#19120) This fixes #18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format is just: put everything at the top-level. For instance, the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX initialisation/venv creation. This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can: - reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up. - reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed). - increase build times. For instance, for one Python 3.9 Lambda in our codebase: | metric | before | after | |---|---|---| | init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) | | compressed size | 24.6MB | 23.8MB (-0.8MB) | | uncompressed size | 117.8MB | 115.8MB (-2.0MB) | | PEX-construction build time | ~5s | ~5s | | PEX-postprocessing build time | 0.14s | 4.8s | (The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.) --- This functionality is driven by adding a new option to the `[lambdex].layout` option added in #19074. In #19074 (targeted for 2.17), it defaults `lambdex` (retaining the current code paths). This PR flips the default to the new option `zip`, which keys into the functionality above. I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start. This PR's commits can be reviewed independently. I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up: - add a warning about `files` being loaded into these packages, which has been temporarily lost (#19027) - adjust documentation #19067 - other improvements like #18195 and #18880 - improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first This is a re-doing of #19022 with a simpler approach to deprecation, as discussed in #19074 (comment) and #19032 (comment). The phasing will be: | release | supports lambdex? | supports zip? | default layout | deprecation warnings | |---|---|---|---|---| | 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to | | 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` | | 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
There's several places that check whether a PEX transitively depends on any `files()` targets, because those are (currently) not included in the PEX. This moves the warning into the "pex from targets" builder, to not have to reimplement it everywhere. This is follow up to #19022, and restores the warning for `layout="zip"`, since I didn't implement it there.
There's several places that check whether a PEX transitively depends on any `files()` targets, because those are (currently) not included in the PEX. This moves the warning into the "pex from targets" builder, to not have to reimplement it everywhere. This is follow up to #19022, and restores the warning for `layout="zip"`, since I didn't implement it there.
…19148) There's several places that check whether a PEX transitively depends on any `files()` targets, because those are (currently) not included in the PEX. This moves the warning into the "pex from targets" builder, to not have to reimplement it everywhere. This is follow up to #19022, and restores the warning for `layout="zip"`, since I didn't implement it there. Co-authored-by: Huon Wilson <[email protected]>
This fixes #18879 by allowing the
python_awslambda
andpython_google_cloud_function
FaaS artefacts to be generated in "simple" format, using thepex3 venv create --layout=flat-zipped
functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format is just: put everything at the top-level, e.g. the zip containscowsay/__init__.py
etc., rather than.deps/cowsay-....whl
(plus the dynamic PEX initialisation).This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can:
.bootstrap/
folder seems to be about 2MB uncompressed, ~1MB compressed).For instance, for one Python 3.9 Lambda in our codebase:
(The PEX-postprocessing time metric is specifically the time to run the
Setting up handler
(lambdex) orBuild python_awslambda
(pex3 venv create
) process, computed by runningpants --keep-sandboxes=always package ...
for each layout, and thenhyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh
. This doesn't include the time to construct the input PEX, which is the same for both.)This functionality is driven by adding a new
layout
field. It defaults tolambdex
(retaining the current code paths), but also supportszip
, which keys into the functionality above. I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start.This PR's commits can be reviewed independently. It comes in three phases:
pex_venv.py
util rules for runningpex3 venv create ...
. Currently this only supports a limited subset of the functionality there, but can presumably be expanded freely as required. (First commit)I think this is an acceptable MVP for this functionality, but there's various bits of follow-up:
layout="lambdex"
(in favour oflayout="zip"
and/or normalpex_binary
) (Deprecate lambdex for building FaaS packages (pending) #19032)files
being loaded into these packages, which has been temporarily lost (Warn aboutfiles()
in PEXes in one place #19027)complete_platforms
for serverless/FaaS environmentsruntime
#18195 and Allow building AWS Lambda Layers #18880pex3 venv create ...
could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first