Should blacklisted_websites be anchored? #2796
Comments
Sounds buggish to me
I'd also looked into this, and found the same thing. My opinion is that these definitely should be anchored. Not anchoring them results in FP and is contrary to most people's expectations when moving something from the watchlist to the website blacklist. If a particular regex is intended to match unanchored, then it should be specifically constructed to do so, rather than every regex in the list needing its own individual anchors. My impression from previously looking at the regexes was that a few of them rely on the fact that matching is not anchored, but that the majority were written with the expectation that they are anchored and occasionally produce FP. Unfortunately, the only way I've seen to accurately determine whether each individual regex is intended to be anchored or unanchored would be to run each regex through an MS search with and without the anchor we choose, and compare the number of TP and FP produced in each case. Note: the anchor needs to be more complex than just
Currently, for those website blacklist items which I see detect something they aren't intended to detect, I'm using bookends of:
This issue has been closed because it has had no recent activity. If this is still important, please add another comment and find someone with write permissions to reopen the issue. Thank you for your contributions.
Copied from Charcoal-SE#2796 (comment)
I thought this was a regression, but it looks (from going back a couple of years in Git history) like the regex for blacklisted websites was in fact never anchored.
We have a false positive where ello.co matches trello.com, and of course, fixing that in isolation would be trivial; but do we really have any cases where the absence of anchoring is actually a feature, or is this a bug which should have been fixed long ago?
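To make the false positive concrete, here is a minimal Python sketch of the difference between unanchored and anchored matching. The entry pattern, the `BEFORE`/`AFTER` bookends, and the `matches` helper are assumptions made up for illustration; they are not the bookends SmokeDetector uses or the ones mentioned in the comments. A plain `\b` variant is included only for comparison, since `\b` also sits next to hyphens and dots.

```python
import re

# Hypothetical blacklist entry, written the way the issue describes it.
ENTRY = r"ello\.co"

# Assumed bookends (illustrative only): no hostname character directly
# before the entry, and no hostname character, optionally preceded by a
# dot, directly after it.
BEFORE = r"(?<![a-z0-9-])"
AFTER = r"(?!\.?[a-z0-9-])"

unanchored = re.compile(ENTRY, re.IGNORECASE)
anchored = re.compile(BEFORE + ENTRY + AFTER, re.IGNORECASE)
word_boundary = re.compile(r"\b" + ENTRY + r"\b", re.IGNORECASE)


def matches(pattern, text):
    """Return True if the compiled pattern is found anywhere in text."""
    return pattern.search(text) is not None


samples = [
    "trello.com",                 # the false positive from this issue
    "ello.co",                    # the intended target
    "https://ello.co/some-user",  # target inside a URL
    "www.ello.co",                # subdomain of the target
    "h-ello.co",                  # different hostname containing a hyphen
    "ello.co.uk",                 # different domain with a longer suffix
]

for text in samples:
    print(text, matches(unanchored, text),
          matches(anchored, text), matches(word_boundary, text))
```

Under these assumptions the unanchored pattern hits every sample, while the bookended pattern hits only the bare domain, the URL, and the subdomain. The `\b` variant rejects trello.com but still hits h-ello.co and ello.co.uk. Whether subdomains should count as a match is a policy decision this sketch does not settle.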