Consider ways to work around stats::filter() when applying dplyr::filter() lints #2078

MichaelChirico · 2023-08-10T23:02:43Z

Follow-up to #2077.

That PR is conservative about which filter() calls to include. We do not match filter() unless it's namespace-qualified as dplyr::filter(), to avoid false positives involving stats::filter(). I'm not so familiar with stats::filter(); my understanding is that it'd be really weird to use & there too, but in any case the lint message would look strange.

We might parameterize this instead to increase the reach, e.g. assume_dplyr (when TRUE, all filter() calls are matched) or allow_conjunct_filter (when TRUE, all filter() calls are skipped).

Another, less conservative option: assume filter() is dplyr::filter() unless it's namespace-qualified as coming from another namespace. That gives users an out if needed, but defaults to assuming everything comes from dplyr (which seems most likely among users who'd have this lint active).
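
To make the "assume dplyr unless qualified otherwise" option concrete, here is a minimal base-R sketch. The helper treat_as_dplyr_filter() is hypothetical and not part of lintr (which matches calls via XPath on the parse tree); it inspects only the outermost call of a snippet, so piped calls such as DF %>% filter(A & B) are out of scope here.

```r
# Hypothetical helper, NOT lintr's implementation: decide whether the
# outermost call in `code` would be treated as dplyr::filter() under the
# "assume dplyr unless qualified otherwise" policy.
treat_as_dplyr_filter <- function(code) {
  expr <- parse(text = code, keep.source = FALSE)[[1L]]
  if (!is.call(expr)) {
    return(FALSE)
  }
  fn <- expr[[1L]]
  # Unqualified call: filter(...) is assumed to be dplyr's.
  if (is.name(fn)) {
    return(identical(as.character(fn), "filter"))
  }
  # Qualified call: pkg::filter(...) counts only when pkg is dplyr.
  if (is.call(fn) && identical(as.character(fn[[1L]]), "::")) {
    return(
      identical(as.character(fn[[2L]]), "dplyr") &&
        identical(as.character(fn[[3L]]), "filter")
    )
  }
  FALSE
}

treat_as_dplyr_filter("filter(DF, A & B)")             # TRUE
treat_as_dplyr_filter("dplyr::filter(DF, A & B)")      # TRUE
treat_as_dplyr_filter("stats::filter(x, rep(1/3, 3))") # FALSE
```

The explicit stats:: prefix is the "out" mentioned above: qualifying the call is enough to keep it from being treated as a dplyr call.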

MichaelChirico · 2023-08-10T23:04:45Z

Some tests I'd written about this future behavior:

# TODO(michaelchirico): shut these off to stay on the conservative side and
#   only lint for calls that we _know_ are coming from dplyr. consider
#   whether to use an argument to change this, or if we can improve the
#   logic to ensure dplyr::filter() is being used.
# using & in stats::filter() calls should be uncommon, but ensure
#   either dplyr:: is used or there's no namespace qualification
expect_lint(
  "stats::filter(A & B)",
  NULL,
  conjunct_test_linter()
)
expect_lint(
  "ns::filter(A & B)",
  NULL,
  conjunct_test_linter()
)
expect_lint(
  "DF %>% filter(A & B)",
  "Use dplyr::filter\\(DF, A, B\\) instead of dplyr::filter\\(DF, A & B\\)",
  conjunct_test_linter()
)

AshesITR · 2023-08-11T04:58:49Z

assume_dplyr_loaded = FALSE should be opt-in, I think.

Or instead we do a dynamic check if the globally loaded / NAMESPACE-Imported filter() is from dplyr to remove the NS prefix condition in the XPath?

I think both approaches would greatly increase the amount of true positives.

Also, good call: we should never lint package-prefixed calls from packages other than dplyr.
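
For the dynamic-check idea, a rough base-R sketch of how one might ask which package an unqualified function name currently resolves to. The helper resolve_namespace() is hypothetical and is not something lintr provides; it only reflects the search path of the session running the check, not the NAMESPACE imports of the package being linted.

```r
# Hypothetical helper, NOT part of lintr: report the namespace that an
# unqualified function name resolves to in the current session.
resolve_namespace <- function(fn_name, envir = globalenv()) {
  # Look the name up along the search path, considering functions only.
  if (!exists(fn_name, envir = envir, mode = "function")) {
    return(NA_character_)
  }
  fn <- get(fn_name, envir = envir, mode = "function")
  env <- environment(fn)
  # Primitives such as sum() have no enclosing environment; they live in base.
  if (is.null(env)) "base" else environmentName(env)
}

# In a session with only the default packages attached this resolves to
# stats; after library(dplyr) it would resolve to dplyr instead.
resolve_namespace("filter")
```

A linter could relax the namespace condition only when this returns "dplyr", at the cost of making the result depend on what is attached where the linter runs.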

AshesITR · 2023-08-13T07:08:19Z

I've also seen dplyr::filter(DF, A && B) a lot from beginners. Would this be the right place to also lint this common mistake?

MichaelChirico · 2023-08-13T08:01:33Z

I've also seen this... maybe more appropriate for vector_logic_linter()?

MichaelChirico · 2023-09-07T03:54:20Z

Now that we have allow_filter, maybe we should combine this issue with that FR.

WDYT about allow_filter = c("assume_dplyr", "strict_dplyr", "always")? The first applies to all unqualified filter() calls; the second only to dplyr::filter(); the third doesn't check filter() calls. cc @salim-b for viz.

AshesITR · 2023-09-07T05:09:12Z

The options, especially "strict_dplyr" seem unintuitive to me.

Maybe "never", "not_dplyr", "always"?

MichaelChirico · 2023-09-07T05:25:02Z

"never" also sounds too strict, right? Since qualified stats::filter() would never lint. Maybe "qualified_dplyr" sounds better than "strict_dplyr"?

AshesITR · 2023-09-07T05:37:12Z

But we don't "allow_filter" "qualified_dplyr" 😅

We allow everything but qualified dplyr calls.
Not happy with allow_filter = c("qualified"? "unqualified", "always"), but maybe it inspires you.

AshesITR · 2023-09-07T05:40:51Z

Another idea: allow_dplyr_filter = c("never", "unqualified", "always")

salim-b · 2023-09-07T09:36:26Z

Just my 5 cents: If it's possible for lintr to auto-detect this, i.e. to

> do a dynamic check if the globally loaded / NAMESPACE-Imported filter() is from dplyr to remove the NS prefix condition in the XPath?

I'd strongly prefer that instead of extending parameters. 🙂

MichaelChirico · 2023-09-07T16:35:40Z

> I'd strongly prefer that instead of extending parameters. 🙂

We might do that in addition to extending parameters, but requiring that is not appealing to me in general -- it requires the machine executing the linter to have dplyr installed, which may be true for developers working locally on their own package, but is not true by default in distributed work environments.

Another suggestion:

allow_filter = c("assuming_dplyr", "checking_dplyr", "always")

Otherwise, the earlier suggestion of "never", "not_dplyr", "always" is growing on me...
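
To compare the naming proposals, here is a small sketch of one reading of the "never" / "not_dplyr" / "always" semantics discussed in this thread. The function should_lint_filter() and its qualifier argument are hypothetical and do not reproduce lintr's actual implementation: the argument names what usage of & in filter() is allowed, so "never" lints both unqualified and dplyr:: calls, "not_dplyr" lints only dplyr::filter(), and "always" lints nothing.

```r
# Hypothetical decision helper, NOT lintr's implementation.
# `qualifier` is the namespace prefix of a filter() call, or NA when the
# call is unqualified. Returns TRUE when filter(A & B) should be linted.
should_lint_filter <- function(qualifier,
                               allow_filter = c("never", "not_dplyr", "always")) {
  allow_filter <- match.arg(allow_filter)
  switch(allow_filter,
    # Assume unqualified calls are dplyr's; other namespaces are skipped.
    never = is.na(qualifier) || identical(qualifier, "dplyr"),
    # Only explicitly qualified dplyr::filter() calls are linted.
    not_dplyr = identical(qualifier, "dplyr"),
    # filter() calls are not checked at all.
    always = FALSE
  )
}

# Print the decision for every mode and qualifier combination.
for (mode in c("never", "not_dplyr", "always")) {
  for (q in c(NA_character_, "dplyr", "stats")) {
    cat(sprintf("%-10s %-6s lint=%s\n", mode, q, should_lint_filter(q, mode)))
  }
}
```

Under this reading, a qualified stats::filter() call is never linted in any mode, which matches the point raised earlier in the thread about "never".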

MichaelChirico added the feature (a feature request or enhancement) and false-negative (code that should lint, but doesn't) labels Aug 10, 2023

MichaelChirico mentioned this issue Aug 10, 2023

extend conjunct_test_linter for dplyr::filter() #2077

Merged

MichaelChirico added this to the 3.1.1 milestone Aug 22, 2023

This was referenced Sep 14, 2023

Extend vector_logic_linter() to include dplyr::filter(x, A && B) #2166

Closed

Extend allow_filter in conjunct_test_linter() to allow linting only dplyr::filter() (i.e. qualified) #2169

Merged

MichaelChirico closed this as completed in #2169 Sep 14, 2023
