Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ruff] update and vendor annotate-snippets #15359

Merged
merged 23 commits into from
Jan 15, 2025
Merged

Conversation

BurntSushi
Copy link
Member

@BurntSushi BurntSushi commented Jan 8, 2025

This PR upgrades to the latest version of
annotate-snippets
and vendors it. The motivating reason for vendoring the
crate is because it makes a non-elidable change to the
diagnostic header
. (Note that
I've submitted a PR
to help us work-around that issue, but it might not be
a good fit for upstream.) Moreover,
while annotate-snippets is dense, it's not very big. And
I think it makes sense for us to maintain our own copy,
at least for now, so that we can be masters of our own
destiny over a critical component of Ruff.

For example, our vendored copy now contains two bug fixes
to annotate-snippets that I sent to upstream, but we
can benefit from the fixes now. See rust-lang/annotate-snippets-rs#169
and rust-lang/annotate-snippets-rs#170.

Reviewers should look at this PR commit-by-commit. The
unusually large number of commits is a result of dividing
the snapshot changes (over 1,000 of them) into different
categories. I did this so that I could more easily review
the changes and ensure they are correct and/or desirable.
In some cases, they are bug fixes. In other cases, they
are just stylistic improvements. In yet other cases, they
flagged bugs in annotate-snippets themselves. While the
number of commits is large, I actually believe this will
make review much easier and quicker.

Closes #10832, Ref #11618

@BurntSushi
Copy link
Member Author

BurntSushi commented Jan 8, 2025

Bah, a bunch of Clippy lints are firing on the vendored copy of annotate-snippets. I'm considering just squashing all of them in order to reduce the diff with upstream, but that might be a non-goal.

EDIT: This is what I ended up doing.

Copy link
Contributor

github-actions bot commented Jan 8, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

openai/openai-cookbook (error)

warning: Detected debug build without --no-cache.
error: Failed to read examples/Assistants_API_overview_python.ipynb: Expected a Jupyter Notebook, which must be internally stored as JSON, but this file isn't valid JSON: expected `,` or `]` at line 197 column 8

Formatter (preview)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

openai/openai-cookbook (error)

ruff format --preview

warning: Detected debug build without --no-cache.
error: Failed to read examples/Assistants_API_overview_python.ipynb: Expected a Jupyter Notebook, which must be internally stored as JSON, but this file isn't valid JSON: expected `,` or `]` at line 197 column 8

@BurntSushi BurntSushi force-pushed the ag/upgrade-annotate-snippets branch 2 times, most recently from 1175d5e to 572632e Compare January 8, 2025 20:54
crates/ruff_linter/src/message/text.rs Outdated Show resolved Hide resolved
crates/ruff_linter/src/message/text.rs Show resolved Hide resolved
crates/ruff_linter/src/message/text.rs Outdated Show resolved Hide resolved
crates/ruff_linter/src/message/text.rs Outdated Show resolved Hide resolved
@Gankra
Copy link

Gankra commented Jan 8, 2025

partial review status: i've "approved" all the commits that aren't test:

@MichaReiser MichaReiser added the do-not-merge Do not merge this pull request label Jan 9, 2025
@MichaReiser
Copy link
Member

Adding "do-not-merge" until the 0.9 release is out

@BurntSushi BurntSushi changed the title update and vendor annotate-snippets [ruff] update and vendor annotate-snippets Jan 9, 2025
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. Thank you and thanks a ton for splitting the commits as you did. It allowed me to skip over a large chunk of the snapshot changes because I knew they're all whitespace (and github's UI is pain because it requires clicking load changes for many of them).

I think there are a few cases where the snapshots now are incorrect and may require changes to the rule or rendering. The truncation itself is nice, but I think we can't use ... because it's a valid syntax element in Python.

crates/ruff_python_parser/tests/fixtures.rs Show resolved Hide resolved
crates/ruff_linter/src/message/text.rs Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this commit. I think that all changes here are undesired other than for I001. We should double check what the ranges are but most of the rules try to render a marker right at the start of a line with an empty range.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are unfortunately turning out to be extremely difficult to fix. I was able to fix the E112 regressions by tweaking the spans generated by that specific lint, but some of the other non-I001 changes in this commit are harder to fix. For example, the syntax errors seem slightly off. The issue is that the spans are generated by the Python parser, and I'm not sure I want to dig into tweaking those. So if I change annotate-snippets to render the position after the line terminator to be at the start of the following line instead of the end of the preceding line, it ends up regressing a bunch of the other bug fixes in this PR.

I haven't looked at the D207 errors yet, but I suspect I'll be able to fix that by tweaking the spans like I did for E112.

So I think that overall leaves a few of the syntax error diagnostics (it doesn't appear to be many) that might wind up a little more confusing than the status quo, but still not outright wrong I think. Thoughts on me filing an issue for that but otherwise moving on?

@sharkdp
Copy link
Contributor

sharkdp commented Jan 9, 2025

and github's UI is pain because it requires clicking load changes for many of them

@MichaReiser I have a "Load diffs" bookmark in my browsers toolbar which does this automatically 😄

javascript:document.querySelectorAll('.load-diff-button').forEach(node => node.click())

@AlexWaygood AlexWaygood removed the do-not-merge Do not merge this pull request label Jan 9, 2025
@BurntSushi BurntSushi force-pushed the ag/upgrade-annotate-snippets branch from 1c92833 to bea42de Compare January 9, 2025 18:50
@BurntSushi
Copy link
Member Author

BurntSushi commented Jan 9, 2025

I've updated the PR with some rewritten history and addressed a good portion of the feedback so far. What's remaining is some of the thornier problems with some of the new rendering. My plan is to investigate those, figure out their root causes and figure out how to fix them. I don't think a re-review is helpful yet. I'll ping y'all when a re-review makes more sense.

@BurntSushi BurntSushi force-pushed the ag/upgrade-annotate-snippets branch 5 times, most recently from bc2167a to 620bd98 Compare January 14, 2025 19:21
@dhruvmanila dhruvmanila added internal An internal refactor or improvement diagnostics Related to reporting of diagnostics. labels Jan 15, 2025
This merely adds the crate to our repository. Some cosmetic changes are
made to make it work in our repo and follow our conventions, such as
changing the name to `ruff_annotate_snippets`. We retain the original
license information. We do drop some things, such as benchmarks, but
keep tests and examples.
This is a tiny change that, perhaps slightly shady, permits us to use
the `annotate-snippets` renderer without its mandatory header (which
wasn't there in `annotate-snippets 0.9`). Specifically, we can now do
this:

    Level::None.title("")

The combination of a "none" level and an empty label results in the
`annotate-snippets` header being skipped entirely. (Not even an empty
line is written.)

This is maybe not the right API for upstream `annotate-snippets`, but
it's very easy for us to do and unblocks the upgrade (albeit relying on
a vendored copy).

Ref rust-lang/annotate-snippets-rs#167
This is pretty much just moving to the new API and taking care to use
byte offsets. This is *almost* enough. The next commit will fix a bug
involving the handling of unprintable characters as a result of
switching to byte offsets.
Previously, we were replacing unprintable ASCII characters with a
printable representation of them via fancier Unicode characters. Since
`annotate-snippets` used to use codepoint offsets, this didn't make our
ranges incorrect: we swapped one codepoint for another.

But now, with the `annotate-snippets` upgrade, we use byte offsets
(which is IMO the correct choice). However, this means our ranges can be
thrown off since an ASCII codepoint is always one byte and a non-ASCII
codepoint is always more than one byte.

Instead of tweaking the `ShowNonprinting` trait and making it more
complicated (which is used in places other than this diagnostic
rendering it seems), we instead change `replace_whitespace` to handle
non-printable characters. This works out because `replace_whitespace`
was already updating the annotation range to account for the tab
replacement. We copy that approach for unprintable characters.
It's hard to grok the change from the snapshot diffs alone, so here's
one example. Before:

    PYI021.pyi:15:5: PYI021 [*] Docstrings should not be included in stubs
       |
    14 |   class Baz:
    15 |       """Multiline docstring
       |  _____^
    16 | |
    17 | |     Lorem ipsum dolor sit amet
    18 | |     """
       | |_______^ PYI021
    19 |
    20 |       def __init__(self) -> None: ...
       |
       = help: Remove docstring

And now after:

    PYI021.pyi:15:5: PYI021 [*] Docstrings should not be included in stubs
       |
    14 |   class Baz:
    15 | /     """Multiline docstring
    16 | |
    17 | |     Lorem ipsum dolor sit amet
    18 | |     """
       | |_______^ PYI021
    19 |
    20 |       def __init__(self) -> None: ...
       |
       = help: Remove docstring

I personally think both of these are fine. If we felt strongly, I could
investigate reverting to the old style, but the new style seems okay to
me.

In other words, these updates I believe are just cosmetic and not a bug
fix.
The previous rendering just seems wrong in that a `^` is omitted. The
new version of `annotate-snippets` seems to get this right. I checked a
pseudo random sample of these, and it seems to only happen when the
position pointed at a line terminator.
This looks like a bug fix that occurs when the annotation is a
zero-width span immediately following a line terminator. Previously, the
caret seems to be rendered on the next line, but it should be rendered
at the end of the line the span corresponds to.

I admit that this one is kinda weird. I would somewhat expect that our
spans here are actually incorrect, and that to obtain this sort of
rendering, we should identify a span just immediately _before_ the line
terminator and not after it. But I don't want to dive into that rabbit
hole for now (and given how `annotate-snippets` now renders these
spans, perhaps there is more to it than I see), and this does seem like
a clear improvement given the spans we feed to `annotate-snippets`.
I believe this case is different from the last in that it happens when
the end of a *multi-line* annotation occurs after a line terminator.
Previously, the diagnostic would render on the next line, which is
definitely a bit weird. This new update renders it at the end of the
line the annotation ends on.

In some cases, the annotation was previously rendered to point at source
lines below where the error occurred, which is probably pretty
confusing.
This group of updates is similar to the last one, but they call out the
fact that while the change is an improvement, it does still seem to be a
little buggy.

As one example, previously we would have this:

       |
     1 | / from __future__ import annotations
     2 | |
     3 | | from typing import Any
     4 | |
     5 | | from requests import Session
     6 | |
     7 | | from my_first_party import my_first_party_object
     8 | |
     9 | | from . import my_local_folder_object
    10 | |
    11 | |
    12 | |
    13 | | class Thing(object):
       | |_^ I001
    14 |     name: str
    15 |     def __init__(self, name: str):
       |
       = help: Organize imports

And now here's what it looks like after:

       |
     1 | / from __future__ import annotations
     2 | |
     3 | | from typing import Any
     4 | |
     5 | | from requests import Session
     6 | |
     7 | | from my_first_party import my_first_party_object
     8 | |
     9 | | from . import my_local_folder_object
    10 | |
    11 | |
    12 | |
       | |__^ Organize imports
    13 |   class Thing(object):
    14 |     name: str
    15 |     def __init__(self, name: str):
       |
       = help: Organize imports

So at least now, the diagnostic is not pointing to a completely
unrelated thing (`class Thing`), but it's still not quite pointing to
the imports directly. And the `^` is a bit offset. After looking at
some examples more closely, I think this is probably more of a bug
with how we're generating offsets, since we are actually pointing to
a location that is a few empty lines _below_ the last import. And
`annotate-snippets` is rendering that part correctly. However, the
offset from the left (the `^` is pointing at `r` instead of `f` or even
at the end of `from . import my_local_folder_object`) appears to be a
problem with `annotate-snippets` itself.

We accept this under the reasoning that it's an improvement, albeit not
perfect.
…pace

I separated out this snapshot update since the string of `^` including
whitespace looked a little odd. I investigated this one specifically,
and indeed, our span in this case is telling `annotate-snippets` to
point at the whitespace. So this is `annotate-snippets` doing what it's
told with a mildly sub-optimal span.

For clarity, the before rendering is:

    skip.py:34:1: I001 [*] Import block is un-sorted or un-formatted
       |
    32 |       import sys; import os  # isort:skip
    33 |       import sys; import os  # isort:skip  # isort:skip
    34 | /     import sys; import os
       |
       = help: Organize imports

And now after is:

    skip.py:34:1: I001 [*] Import block is un-sorted or un-formatted
       |
    32 |     import sys; import os  # isort:skip
    33 |     import sys; import os  # isort:skip  # isort:skip
    34 |     import sys; import os
       | ^^^^^^^^^^^^^^^^^^^^^^^^^ I001
       |
       = help: Organize imports

This is a clear bug fix since it adds in the `I001` annotation, even
though the carets look a little funny by including the whitespace
preceding `import sys; import os`.
This one almost looks like it fits into the other failure categories,
but without identifying root causes, it's hard to say for sure. The span
here does end after a line terminator, so it feels like it's like the
rest.

I also isolated this change since I found the snapshot diff pretty hard
to read and wanted to look at it more closely. In this case, the before
is:

    E204.py:31:2: E204 [*] Whitespace after decorator
       |
    30 |   # E204
    31 |   @ \
       |  __^
    32 | | foo
       | |_^ E204
    33 |   def baz():
    34 |       print('baz')
       |
       = help: Remove whitespace

And the after is:

    E204.py:31:2: E204 [*] Whitespace after decorator
       |
    30 | # E204
    31 | @ \
       |  ^^ E204
    32 | foo
    33 | def baz():
    34 |     print('baz')
       |
       = help: Remove whitespace

The updated rendering is clearly an improvement, since `foo` itself is
not really the subject of the diagnostic. The whitespace is.

Also, the new rendering matches the span fed to `annotate-snippets`,
where as the old rendering does not.
…ource

The change to the rendering code is elaborated on in more detail here,
where I attempted to upstream it:
rust-lang/annotate-snippets-rs#169

Otherwise, the snapshot diff also shows a bug fix: a `^` is now rendered
where as it previously was not.
This fix was sent upstream and the PR description includes more details:
rust-lang/annotate-snippets-rs#170

Without this fix, there was an errant snapshot diff that looked like
this:

  |
1 |   version = "0.1.0"
2 |   # Ensure that the spans from toml handle utf-8 correctly
3 |   authors = [
  |  ___________^
4 | |     { name = "Z...ALGO", email = 1 }
5 | | ]
  | |_^ RUF200
  |

That ellipsis should _not_ be inserted since the line is not actually
truncated. The handling of line length (in bytes versus actual rendered
length) wasn't quite being handled correctly in all cases.

With this fix, there's (correctly) no snapshot diff.
We do this because `...` is valid Python, which makes it pretty likely
that some line trimming will lead to ambiguous output. So we add support
for overriding the cut indicator. This also requires changing some of
the alignment math, which was previously tightly coupled to `...`.

For Ruff, we go with `…` (`U+2026 HORIZONTAL ELLIPSIS`) for our cut
indicator.

For more details, see the patch sent to upstream:
rust-lang/annotate-snippets-rs#172
This updates snapshots where long lines now get trimmed with
`annotate-snippets`. And an ellipsis is inserted to indicate trimming.

This is a little hokey to test since in tests we don't do any styling.
And I believe this just uses the default "max term width" for rendering.
But in real life, it seems like a big improvement to have long lines
trimmed if they would otherwise wrap in the terminal. So this seems like
an improvement to me.

There are some other fixes here that overlap with previous categories.
This looks like a bug fix since the caret is now pointing right at the
position of the unprintable character. I'm not sure if this is a result
of an improvement via the `annotate-snippets` upgrade, or because of
more accurate tracking of annotation ranges even after unprintable
characters are replaced. I'm tempted to say the former since in theory
the offsets were never wrong before because they were codepoint offsets.

Regardless, this looks like an improvement.
This change also requires some shuffling to the offsets we generate for
the diagnostic. Previously, we were generating an empty range
immediately *after* the line terminator and immediate before the first
byte of the subsequent line. How this is rendered is somewhat open to
interpretation, but the new version of `annotate-snippets` chooses to
render this at the end of the preceding line instead of the beginning of
the following line.

In this case, we want the diagnostic to point to the beginning of the
following line. So we either need to change `annotate-snippets` to
render such spans at the beginning of the following line, or we need to
change our span to point to the first full character in the following
line. The latter will force `annotate-snippets` to move the caret to the
proper location.

I ended up deciding to change our spans instead of changing how
`annotate-snippets` renders empty spans after a line terminator. While I
didn't investigate it, my guess is that they probably had good reason
for doing so, and it doesn't necessarily strike me as _wrong_.
Furthermore, fixing up our spans seems like a good idea regardless, and
was pretty easy to do.
This update includes some missing `^` in the diagnostic annotations.

This update also includes some shifting of "syntax error" annotations to
the end of the preceding line. I believe this is technically a
regression, but fixing them has proven quite difficult. I *think* the
best way to do that might be to tweak the spans generated by the Python
parser errors, but I didn't want to dig into that. (Another approach
would be to change the `annotate-snippets` rendering, but when I tried
that and managed to fix these regressions, I ended up causing a bunch of
other regressions.)

Ref 77d4545#r1915458616
Previously, these were pointing to the right place, but were missing the
`^`. With the `annotate-snippets` upgrade, the `^` was added, but they
started pointing to the end of the previous line instead of the
beginning of the following line. In this case, we really want it to
point to the beginning of the following line since we're calling out
indentation issues.

As in a prior commit, we fix this by tweaking the offsets emitted by the
lint itself. Instead of an empty range at the beginning of the line, we
point to the first character in the line. This "forces" the renderer to
point to the beginning of the line instead of the end of the preceding
line.

The end effect here is that the rendering is fixed by adding `^` in the
proper location.
@BurntSushi BurntSushi force-pushed the ag/upgrade-annotate-snippets branch from 620bd98 to 458153d Compare January 15, 2025 15:40
@BurntSushi
Copy link
Member Author

OK, I think this should be ready for re-review!

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. It's unfortunate that we have to adjust the ranges for some lint rules, but I think it's the right trade-off. Would you mind opening an issue to track the "shortcoming" so that we don't forget about it?

A few changes to parser diagnostics might be due to the same bug. Although I'm not 100% certain, it's definitely possible that the parser emits the wrong ranges here.

The relevant code (just for reference) is:

self.add_error(
recovery_context_kind.create_error(self),
self.current_token_range(),
);

I don't suggest you change the emitted range. I'd be okay accepting this regression for now. But we may want to double check if it's the parser's or or rendering fault.

One thing I want to quickly test is if the changed ranges impact the VS code extension in anyway because it uses the same ranges as the generated diagnostics.

@@ -198,8 +197,8 @@ Module(
|
3 | exept: # spellchecker:disable-line
4 | pass
| ^ Syntax Error: Expected a statement
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I guess this is another case where the offset used to be at the start of the line and the caret now ends up at the end of the previous line. Unfortunately, this will be harder to fix, and I don't think we have tests for every position where we expect a node.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aye, I filed an issue about this: #15510

@@ -77,4 +76,5 @@ Module(
|
2 | if True)):
3 | pass
| ^ Syntax Error: Expected a statement
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only noticed those now. They should probably point at the start of the next line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this was what I mentioned in one of my other comments. I tried fixing this, but it's hard. I would either need to change the error offsets produced by the parser (which I'm not sure I really want to touch) or I'd need to fix annotate-snippets to render empty spans immediately after a line terminator to point at the beginning of the following line instead of the end of the preceding line.

@MichaReiser
Copy link
Member

MichaReiser commented Jan 15, 2025

Okay, somewhat "bad" news. The new diagnostic ranges do regress the LSP experience. It's not in a significant way that I think is worth blocking this PR but it is reason enough that we should change annotation-snippet to support empty ranges pointing at the start of a line. This isn't something we have to follow up on immediately but I consider it "serious" enough that it's something we should act on in the coming months.

I tested it with E115 using

if False:  
print()

Before

The quick fix was only shown when the cursor is at the start of the line:

if False:  
print()
^-- cursor has to be here

Now

The quick fix is now shown if the cursor is at the start of the line or between p and r. This is incorrect because the action doesn't make sense at that position:

if False:  
print()
^ ---- cursor can be here
 ^ ----- or here

@BurntSushi
Copy link
Member Author

Would you mind opening an issue to track the "shortcoming" so that we don't forget about it?

Done here: #15509

I don't suggest you change the emitted range. I'd be okay accepting this regression for now. But we may want to double check if it's the parser's or or rendering fault.

Yeah I did look into this earlier. And I think our spans are correct. I thought I left a comment about it, but it probably got lost in the see of collapsed messages above haha. In any case, I filed an issue about this: #15510

@BurntSushi
Copy link
Member Author

@MichaReiser had an idea that maybe we can just "fix up" the ranges generated with the diagnostics in the renderer. I think that's worth a try, so I'm going to give that a go before merging here.

Instead of doing this on a lint-by-lint basis, we now just do it right
before rendering. This is more broadly applicable.

Note that this doesn't fix the diagnostic rendering for the Python
parser. But that's using a different path anyway (`annotate-snippets` is
only used in tests).
@BurntSushi BurntSushi force-pushed the ag/upgrade-annotate-snippets branch from 282de25 to 6b3bda4 Compare January 15, 2025 18:10
@BurntSushi
Copy link
Member Author

@MichaReiser had an idea that maybe we can just "fix up" the ranges generated with the diagnostics in the renderer. I think that's worth a try, so I'm going to give that a go before merging here.

This ended up working. I think this will fix the LSP regression?

@BurntSushi BurntSushi merged commit c9b99e4 into main Jan 15, 2025
21 checks passed
@BurntSushi BurntSushi deleted the ag/upgrade-annotate-snippets branch January 15, 2025 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
diagnostics Related to reporting of diagnostics. internal An internal refactor or improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update annotate-snippets to 0.11.0
6 participants