-
-
Notifications
You must be signed in to change notification settings - Fork 646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add future lib and port src/base #6067
Conversation
…ompatibility. See http://python-future.org/overview.html for features and more context.
|
||
|
||
@total_ordering |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implements all 6 comparison dunders, given __eq__
and __lt__
.
There is a slight performance penalty. If Revision
is used a lot, I can manually add the other comparisons.
|
||
def __ne__(self, other): | ||
return not self.__eq__(other) | ||
if not self._is_valid_operand(other): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This slightly changes the behavior from before. When comparing to non-Revisions, this returns NotImplemented
, which is the expected behavior. instead of the prior False
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any python3 compat reason to mix in this behavior change? As an example of future conversion PRs, it would be good to stick to pure semantic-preserving automated conversion if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I made the change is for the@total_ordering
decorator, so that we treat __eq__
and __lt__
more similarly. I believe I can revert back to original semantics if you prefer, though.
The only Py3 change we need is replacing __cmp__
with __lt__
or one of the other rich comparison functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer it if just to keep semantics changes out of the conversion PRs. Especially since Revision
is :API: public
and the semantic shift is False
-> raise. We can't know how this will break API users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More on :API: public
here: https://www.pantsbuild.org/deprecation_policy.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted __eq__
to return False
But am keeping __lt__
as returning NotImplemented
. The prior implementation in __cmp__
would return AttributeError
if other._components
is not defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I think I can buy the AttributeError
case as not being part of the public api, so this change is probably OK. That said - I'd avoid that sort of improvement in the conversion PRs in particular going forward. Broken out as separate changes is a different story, that is definitely great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and the semantic shift is False -> raise.
I was mistaken there. The semantic shift is subtle to non-existant depending on how other
implements __eq__
. That said, still a fan of nixing improvements in these PRs. TODO is ok, but it really should be attributed and / or - better - assigned an issue with the issue link noted in the TODO. Your spirit of wanting to fix is admirable and I'm sure you'll see more of these. Perhaps just start 1 umbrella issue to capture all the small TODO's you find then you or someone else can circle back and do the fixes in unrelated PR's.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I reverted everything back to identical semantics from before (just more explicit).
I think it's a good policy to avoid any changes in these PRs because they get buried. Instead, will add notes in form TODO(python3port)
and open an issue at #6071.
|
||
def __ne__(self, other): | ||
return not self.__eq__(other) | ||
if not self._is_valid_operand(other): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any python3 compat reason to mix in this behavior change? As an example of future conversion PRs, it would be good to stick to pure semantic-preserving automated conversion if possible.
src/python/pants/base/mustache.py
Outdated
@@ -34,7 +35,7 @@ def convert_val(x): | |||
return [convert_val(e) for e in x] | |||
else: | |||
return x | |||
items = [(key, convert_val(val)) for (key, val) in args.items()] | |||
items = [(key, convert_val(val)) for (key, val) in list(args.items())] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess these are automated, but they are unfortunate. An Iterable is enough to build a list comprehension and so wrapping in a list is a waste of resources and extra un-needed code noise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. We can manually try to identify places like this if you think it's worth it, although will slow us down and may be error prone knowing when and when not to change back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm reading over this guide and I think we should actually get rid of this call to list()
when it's clear it's not necessary - it leads to less performant code in both languages.
If it's unclear if I can safely remove it, I'll leave it for us to review in PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can definitely safely remove it in this context - ie: immediate consumption of the Iterable. That said, I loathe scaning for these in the conversion PRs - does not scale and we'll likely mess up. It would be awesome if the tool was better or configurable to avoid this case. Presumably it uses an AST and not just string searches. If it does use an AST, a super-nice contribution would be expanding about the dict.{items,keys,values} call to see if the context is the generator of a for loop, and if so - do not wrap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsirois I've been reviewing the changes already - not that much extra work for me to remove list()
when it's used in for loops. Will make your lives easier with PRs also - less changes.
Maintainers stopped updating futurize
, 2to3
, etc last year, so PR wouldn't go far there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM pending green ci
src/python/pants/base/worker_pool.py
Outdated
@@ -103,7 +105,7 @@ def error(e): | |||
|
|||
# We filter out Nones defensively. There shouldn't be any, but if a bug causes one, | |||
# Pants might hang indefinitely without this filtering. | |||
work_iter = iter(filter(None, work_chain)) | |||
work_iter = iter([_f for _f in work_chain if _f]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be just work_iter = (_f for _f in work_chain if _f)
; ie directly construct an iter (generator) vs create a list and then ask it for an iter. If this change was automated however, I'm fine with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll fix it. If we don't change it now, it will probably never get changed and might be confusing to later devs why we're using this roundabout approach.
I do agree with you that we should be more rigorous though with keeping 100% semantic parity (as possible). But for small things like removing extraneous list()
, I think it's worth fixing what futurize
generates.
|
||
def __ne__(self, other): | ||
return not self.__eq__(other) | ||
if not self._is_valid_operand(other): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and the semantic shift is False -> raise.
I was mistaken there. The semantic shift is subtle to non-existant depending on how other
implements __eq__
. That said, still a fan of nixing improvements in these PRs. TODO is ok, but it really should be attributed and / or - better - assigned an issue with the issue link noted in the TODO. Your spirit of wanting to fix is admirable and I'm sure you'll see more of these. Perhaps just start 1 umbrella issue to capture all the small TODO's you find then you or someone else can circle back and do the fixes in unrelated PR's.
@@ -354,7 +355,9 @@ def generate_generated(config, here): | |||
|
|||
def render_html(dst, config, soups, precomputed, template): | |||
soup = soups[dst] | |||
renderer = Renderer() | |||
renderer = Renderer(escape=cgi.escape) # TODO(python3port): cgi.escape is deprecated in favor html.escape |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a tough bug to find. Pystache by default uses cgi.escape()
in Python2, but html.escape()
in Python3. Because future
adds a backport to html
(which doesn't exist in Python2), Pystache successfully was importing the module and using html.escape()
.
html.escape()
escapes '
unlike cgi.escape()
, so one of our tests was failing.
For now, I'm having us stay with the old semantics.
cc @mateor @iantabolt @stuhood @kwlzn @benjyw How do you prefer I proceed with future porting pull requests? @jsirois and I worked out some good practices for what should and should not be included in these PRs, but we need to decide how big we want them to be. Should I submit a module at a time, or all at once, or something else? |
=John on breaking the improvements as distinct from py3-required changes. A benefit to doing that is that it makes the more procedural commits easier to review. I think that testing in a single module makes sense for a first pass. Personally, I might try doing the entire codebase locally, and think about breaking out and landing required logic changes, then landing the remainder as a blob. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Module at a time would definitely be preferable to doing the whole repository at once. If this were something like a simple style change, that would be one thing. But it seems quite likely that there will be actual bugs caused/uncovered by this, and smaller PRs will make bisecting feasible. Keeping behaviour changes isolated into separate PRs would be ideal. Thanks! |
Description
Implements stage 2 of porting plan, which is adding the future library.
Also ports the
src/pants/base
andtests/python/pants_test/base
as a demo of how this tool works.At this stage, we aren't testing things yet with the Python 3 interpreter, even though technically it's Python 3 code. The goal for now is to make sure things still work on Python 2.
FYI - how the backports work
The
future
library adds several modules:future
,builtins
,past
, and individual ports likequeue
(instead ofQueue
).These added names all correspond exactly to Python 3 code. When running Python 3, it will simply use the native Python 3 version. When using Python 2, it will use the library implementation, which is semantically equivalent to Python 3.
FYI - using old
moves
interfaceTypically
futurize
will try to use this idiom for certain backportsThis however does not work in our codebase, because all imports must appear above code with iSort, so the
install_aliases()
line gets moved below the relevant imports.Instead, we are using the legacy
future.moves
library. Once we remove Python 2 support we'll want to convert this type of importfrom future.moves.itertools import x
tofrom itertools import x
.Testing
I ran the tests for everything with the normal Python 2 interpreter - is that enough, or should I hand test any behavior?
Open question - remaining pull requests
The next stage is to run this tool on all remaining source code, which will lead to a huge pull request. How should I split this up? Do you prefer everything being together, one folder at a time, or something else?
I'm using this spreadsheet to track progress.
All of these changes will be generated by futurize, unless otherwise noted where I have to manually fix things. I'm restraining myself from refactors etc to minimize complexity.
FYI - script to automate running tool
I made this helper script to automate the 5 steps it takes to port a file with
futurize
. It's not very robust but you can use to experiment with this tool.