Add future lib and port src/base #6067

Eric-Arellano · 2018-07-05T15:36:41Z

Description

Implements stage 2 of porting plan, which is adding the future library.

Also ports the src/pants/base and tests/python/pants_test/base as a demo of how this tool works.

At this stage, we aren't testing things yet with the Python 3 interpreter, even though technically it's Python 3 code. The goal for now is to make sure things still work on Python 2.

FYI - how the backports work

The future library adds several modules: future, builtins, past, and individual ports like queue (instead of Queue).

These added names all correspond exactly to Python 3 code. When running Python 3, it will simply use the native Python 3 version. When using Python 2, it will use the library implementation, which is semantically equivalent to Python 3.

FYI - using old `moves` interface

Typically futurize will try to use this idiom for certain backports

from future import standard_library
standard_library.install_aliases()

import [backports]

This however does not work in our codebase, because all imports must appear above code with iSort, so the install_aliases() line gets moved below the relevant imports.

Instead, we are using the legacy future.moves library. Once we remove Python 2 support we'll want to convert this type of import from future.moves.itertools import x to from itertools import x.

Testing

I ran the tests for everything with the normal Python 2 interpreter - is that enough, or should I hand test any behavior?

Open question - remaining pull requests

The next stage is to run this tool on all remaining source code, which will lead to a huge pull request. How should I split this up? Do you prefer everything being together, one folder at a time, or something else?

I'm using this spreadsheet to track progress.

All of these changes will be generated by futurize, unless otherwise noted where I have to manually fix things. I'm restraining myself from refactors etc to minimize complexity.

FYI - script to automate running tool

I made this helper script to automate the 5 steps it takes to port a file with futurize. It's not very robust but you can use to experiment with this tool.

…ompatibility. See http://python-future.org/overview.html for features and more context.

Eric-Arellano · 2018-07-05T16:02:38Z

src/python/pants/base/revision.py


+
+@total_ordering


This implements all 6 comparison dunders, given __eq__ and __lt__.

There is a slight performance penalty. If Revision is used a lot, I can manually add the other comparisons.

Eric-Arellano · 2018-07-05T16:04:22Z

src/python/pants/base/revision.py

-
-  def __ne__(self, other):
-    return not self.__eq__(other)
+    if not self._is_valid_operand(other):


This slightly changes the behavior from before. When comparing to non-Revisions, this returns NotImplemented, which is the expected behavior. instead of the prior False.

Is there any python3 compat reason to mix in this behavior change? As an example of future conversion PRs, it would be good to stick to pure semantic-preserving automated conversion if possible.

The reason I made the change is for the@total_ordering decorator, so that we treat __eq__ and __lt__ more similarly. I believe I can revert back to original semantics if you prefer, though.

The only Py3 change we need is replacing __cmp__ with __lt__ or one of the other rich comparison functions.

I'd prefer it if just to keep semantics changes out of the conversion PRs. Especially since Revision is :API: public and the semantic shift is False -> raise. We can't know how this will break API users.

More on :API: public here: https://www.pantsbuild.org/deprecation_policy.html

I reverted __eq__ to return False

But am keeping __lt__ as returning NotImplemented. The prior implementation in __cmp__ would return AttributeError if other._components is not defined.

OK. I think I can buy the AttributeError case as not being part of the public api, so this change is probably OK. That said - I'd avoid that sort of improvement in the conversion PRs in particular going forward. Broken out as separate changes is a different story, that is definitely great.

... and the semantic shift is False -> raise.

I was mistaken there. The semantic shift is subtle to non-existant depending on how other implements __eq__. That said, still a fan of nixing improvements in these PRs. TODO is ok, but it really should be attributed and / or - better - assigned an issue with the issue link noted in the TODO. Your spirit of wanting to fix is admirable and I'm sure you'll see more of these. Perhaps just start 1 umbrella issue to capture all the small TODO's you find then you or someone else can circle back and do the fixes in unrelated PR's.

Okay I reverted everything back to identical semantics from before (just more explicit).

I think it's a good policy to avoid any changes in these PRs because they get buried. Instead, will add notes in form TODO(python3port) and open an issue at #6071.

jsirois · 2018-07-05T16:17:09Z

src/python/pants/base/revision.py

-
-  def __ne__(self, other):
-    return not self.__eq__(other)
+    if not self._is_valid_operand(other):


Is there any python3 compat reason to mix in this behavior change? As an example of future conversion PRs, it would be good to stick to pure semantic-preserving automated conversion if possible.

jsirois · 2018-07-05T16:19:07Z

src/python/pants/base/mustache.py

@@ -34,7 +35,7 @@ def convert_val(x):
        return [convert_val(e) for e in x]
      else:
        return x
-    items = [(key, convert_val(val)) for (key, val) in args.items()]
+    items = [(key, convert_val(val)) for (key, val) in list(args.items())]


I guess these are automated, but they are unfortunate. An Iterable is enough to build a list comprehension and so wrapping in a list is a waste of resources and extra un-needed code noise.

Agreed. We can manually try to identify places like this if you think it's worth it, although will slow us down and may be error prone knowing when and when not to change back.

I'm reading over this guide and I think we should actually get rid of this call to list() when it's clear it's not necessary - it leads to less performant code in both languages.

If it's unclear if I can safely remove it, I'll leave it for us to review in PR.

You can definitely safely remove it in this context - ie: immediate consumption of the Iterable. That said, I loathe scaning for these in the conversion PRs - does not scale and we'll likely mess up. It would be awesome if the tool was better or configurable to avoid this case. Presumably it uses an AST and not just string searches. If it does use an AST, a super-nice contribution would be expanding about the dict.{items,keys,values} call to see if the context is the generator of a for loop, and if so - do not wrap.

@jsirois I've been reviewing the changes already - not that much extra work for me to remove list() when it's used in for loops. Will make your lives easier with PRs also - less changes.

Maintainers stopped updating futurize, 2to3, etc last year, so PR wouldn't go far there.

jsirois

LGTM pending green ci

jsirois · 2018-07-05T23:36:30Z

src/python/pants/base/worker_pool.py

@@ -103,7 +105,7 @@ def error(e):

    # We filter out Nones defensively. There shouldn't be any, but if a bug causes one,
    # Pants might hang indefinitely without this filtering.
-    work_iter = iter(filter(None, work_chain))
+    work_iter = iter([_f for _f in work_chain if _f])


This could be just work_iter = (_f for _f in work_chain if _f); ie directly construct an iter (generator) vs create a list and then ask it for an iter. If this change was automated however, I'm fine with it.

I'll fix it. If we don't change it now, it will probably never get changed and might be confusing to later devs why we're using this roundabout approach.

I do agree with you that we should be more rigorous though with keeping 100% semantic parity (as possible). But for small things like removing extraneous list(), I think it's worth fixing what futurize generates.

jsirois · 2018-07-05T23:42:12Z

src/python/pants/base/revision.py

-
-  def __ne__(self, other):
-    return not self.__eq__(other)
+    if not self._is_valid_operand(other):


... and the semantic shift is False -> raise.

I was mistaken there. The semantic shift is subtle to non-existant depending on how other implements __eq__. That said, still a fan of nixing improvements in these PRs. TODO is ok, but it really should be attributed and / or - better - assigned an issue with the issue link noted in the TODO. Your spirit of wanting to fix is admirable and I'm sure you'll see more of these. Perhaps just start 1 umbrella issue to capture all the small TODO's you find then you or someone else can circle back and do the fixes in unrelated PR's.

…tion.

Eric-Arellano · 2018-07-06T13:19:01Z

pants-plugins/src/python/internal_backend/sitegen/tasks/sitegen.py

@@ -354,7 +355,9 @@ def generate_generated(config, here):

 def render_html(dst, config, soups, precomputed, template):
  soup = soups[dst]
-  renderer = Renderer()
+  renderer = Renderer(escape=cgi.escape)  # TODO(python3port): cgi.escape is deprecated in favor html.escape


This was a tough bug to find. Pystache by default uses cgi.escape() in Python2, but html.escape() in Python3. Because future adds a backport to html (which doesn't exist in Python2), Pystache successfully was importing the module and using html.escape().

html.escape() escapes ' unlike cgi.escape(), so one of our tests was failing.

For now, I'm having us stay with the old semantics.

Eric-Arellano · 2018-07-06T13:25:57Z

cc @mateor @iantabolt @stuhood @kwlzn @benjyw

How do you prefer I proceed with future porting pull requests? @jsirois and I worked out some good practices for what should and should not be included in these PRs, but we need to decide how big we want them to be.

Should I submit a module at a time, or all at once, or something else?

mateor · 2018-07-06T13:47:43Z

=John on breaking the improvements as distinct from py3-required changes.

A benefit to doing that is that it makes the more procedural commits easier to review.

I think that testing in a single module makes sense for a first pass. Personally, I might try doing the entire codebase locally, and think about breaking out and landing required logic changes, then landing the remainder as a blob.

stuhood

Thanks!

stuhood · 2018-07-06T15:23:28Z

Module at a time would definitely be preferable to doing the whole repository at once. If this were something like a simple style change, that would be one thing. But it seems quite likely that there will be actual bugs caused/uncovered by this, and smaller PRs will make bisecting feasible.

Keeping behaviour changes isolated into separate PRs would be ideal. Thanks!

Eric Arellano added 2 commits July 5, 2018 10:25

Add future and configparser libraries to dependencies for python2-3 c…

31167f9

…ompatibility. See http://python-future.org/overview.html for features and more context.

Port src/base to Python3.

06b0db0

Eric-Arellano mentioned this pull request Jul 5, 2018

Add future library to dependencies for python2-3 compatibility. #6064

Closed

Eric-Arellano commented Jul 5, 2018

View reviewed changes

jsirois reviewed Jul 5, 2018

View reviewed changes

Eric Arellano added 3 commits July 5, 2018 18:31

Remove unnecessary calls to list()

0c0e8f7

Revert to prior semantics of Revision __eq__

ce2216b

Missed removing an unnecessary list

77ae991

jsirois approved these changes Jul 5, 2018

View reviewed changes

Eric Arellano added 3 commits July 6, 2018 08:59

Fix failing sitegen test due to changes in standard lib's escape func…

7f07f94

…tion.

Fully restore prior semantics with revision.py and improve TODOs

8bc62a4

Simplify generator expression

70d2547

Eric-Arellano commented Jul 6, 2018

View reviewed changes

stuhood approved these changes Jul 6, 2018

View reviewed changes

stuhood merged commit f783edc into pantsbuild:master Jul 6, 2018

Eric-Arellano deleted the add-future-lib branch July 6, 2018 15:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add future lib and port src/base #6067

Add future lib and port src/base #6067

Eric-Arellano commented Jul 5, 2018

Eric-Arellano Jul 5, 2018

Eric-Arellano Jul 5, 2018

jsirois Jul 5, 2018

Eric-Arellano Jul 5, 2018

jsirois Jul 5, 2018

jsirois Jul 5, 2018

Eric-Arellano Jul 5, 2018 •

edited

Loading

jsirois Jul 5, 2018

jsirois Jul 5, 2018

Eric-Arellano Jul 6, 2018

jsirois Jul 5, 2018

jsirois Jul 5, 2018

Eric-Arellano Jul 5, 2018

Eric-Arellano Jul 5, 2018

jsirois Jul 5, 2018

Eric-Arellano Jul 5, 2018

jsirois left a comment

jsirois Jul 5, 2018

Eric-Arellano Jul 6, 2018

jsirois Jul 5, 2018

Eric-Arellano Jul 6, 2018

Eric-Arellano commented Jul 6, 2018

mateor commented Jul 6, 2018

stuhood left a comment

stuhood commented Jul 6, 2018



		@total_ordering

Add future lib and port src/base #6067

Add future lib and port src/base #6067

Conversation

Eric-Arellano commented Jul 5, 2018

Description

FYI - how the backports work

FYI - using old moves interface

Testing

Open question - remaining pull requests

FYI - script to automate running tool

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Eric-Arellano Jul 5, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jsirois left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Eric-Arellano commented Jul 6, 2018

mateor commented Jul 6, 2018

stuhood left a comment

Choose a reason for hiding this comment

stuhood commented Jul 6, 2018

FYI - using old `moves` interface

Eric-Arellano Jul 5, 2018 •

edited

Loading