add lint to avoid duplicate features #2250
Comments
Hello @mr-tz. I want to work on this issue. Can you please assign it to me? |
you can use anything in the standard library (that is, doesn't require installation from PyPI) as well as anything we already rely upon (see pyproject or requirements.txt). we can also add dependencies as necessary, though we'd prefer a good reason to do so, since they add a little risk and maintenance burden. |
also, go ahead and work on it! |
To what extent do we want to remove duplicate features? We can detect the duplicate features under same |
let's start with the most trivial duplicates, such as the same value under AND or OR. that's what we fixed in the linked PR. |
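A minimal sketch of that "trivial duplicate" check, assuming the rule's features have already been parsed from YAML into nested Python lists and dicts. The helper name `find_duplicates` and the sample data are hypothetical and are not capa's actual lint code; only `import: mscoree._cordllmain` is taken from this thread.

```python
import json
from collections import Counter

# Hypothetical helper for illustration; not part of capa's lint.py.
LOGIC_KEYS = ("and", "or")


def find_duplicates(statement, path=""):
    """Return (path, feature) pairs for features repeated directly under one and/or."""
    found = []
    if isinstance(statement, list):
        for child in statement:
            found.extend(find_duplicates(child, path))
        return found
    if not isinstance(statement, dict):
        return found
    for key, value in statement.items():
        if not isinstance(value, list):
            continue  # leaf feature such as {"import": "..."}
        here = f"{path}/{key}" if path else key
        if key in LOGIC_KEYS:
            # Serialize each child so that identical entries compare equal.
            counts = Counter(json.dumps(child, sort_keys=True) for child in value)
            found.extend((here, json.loads(k)) for k, n in counts.items() if n > 1)
        # Recurse so nested and/or blocks are checked independently.
        found.extend(find_duplicates(value, here))
    return found


# Illustrative data only; this is not the real rule.
rule_features = [
    {
        "or": [
            {"import": "mscoree._cordllmain"},
            {"format": "dotnet"},
            {"import": "mscoree._cordllmain"},
        ]
    }
]

duplicates = find_duplicates(rule_features)
for where, feature in duplicates:
    print(f"duplicate feature under `{where}`: {feature}")
```

Each `and`/`or` block is counted on its own, so the same feature appearing in two different blocks is not reported.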
INFO lint: collecting potentially referenced samples lint.py:1034
⠹ linting rule: compiled to the .NET platform 0% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/1 rule • 0:00:00 < -:--:-- • 00.0 rule/s
compiled to the .NET platform
FAIL: rule contains a duplicate feature under `or`/`and` statement: remove the duplicate features
duplicate line: " - import: mscoree._cordllmain" : line numbers: 17, 19
rules with FAIL:
- compiled to the .NET platform
My current implementation prints in this format. Is it fine? It takes care of duplicate features under a single |
that's great! |
@williballenthin I have made the PR, but there are some issues with it. We cannot use
- string: /dbghelp\.dll/i
  description: WindBG
- string: /dbghelp\.dll/i
  description: WINE
This is due to the feature being multi-lined. So, I wanted to ask what your suggestions are for proceeding with this issue. |
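One way around the multi-line problem is to compare parsed entries rather than raw text lines. The sketch below assumes each feature is a dict with scalar values. Whether two entries that differ only in `description` should count as duplicates is not settled in this thread, so the `ignore_description` flag is hypothetical, as are the function names.

```python
from collections import defaultdict


def feature_key(entry, ignore_description=True):
    """Build a hashable key for one parsed feature entry (scalar values assumed)."""
    items = entry.items()
    if ignore_description:
        items = [(k, v) for k, v in items if k != "description"]
    return tuple(sorted(items))


def group_duplicates(children, ignore_description=True):
    """Map each repeated feature key to the indexes of the entries that share it."""
    groups = defaultdict(list)
    for index, entry in enumerate(children):
        groups[feature_key(entry, ignore_description)].append(index)
    return {key: idxs for key, idxs in groups.items() if len(idxs) > 1}


# The two entries quoted above, as a YAML parser would load them.
children = [
    {"string": "/dbghelp\\.dll/i", "description": "WindBG"},
    {"string": "/dbghelp\\.dll/i", "description": "WINE"},
]

loose = group_duplicates(children)  # description ignored
strict = group_duplicates(children, ignore_description=False)
print(loose)
print(strict)
```

With the description ignored, both entries share one key and are grouped. With it included, they are treated as distinct features.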
In
# in the example code, in debug mode, the array is constructed bytewise on the stack
- basic block:
- and:
- description: bExtendedAttributes for IPv4 TCP on stack, bytewise
# i've kept the values approximately in order while removing some duplicates for clarity
- number: 0x00
- number: 0x0F
- number: 0x1E
- number: 0x41 = A
- number: 0x66 = f
- number: 0x64 = d
- number: 0x4F = O
- number: 0x70 = p
- number: 0x65 = e
- number: 0x6E = n
- number: 0x50 = P
- number: 0x61 = a
- number: 0x63 = c
- number: 0x6B = k
- number: 0x65 = e
- number: 0x74 = t
- number: 0x58 = X
- number: 0x02
- number: 0x01
- number: 0x06
- number: 0x00
- number: 0x60
- number: 0xEF
- number: 0x3D
- number: 0x47
- number: 0xFE
Are |
Yes |
OK, then. My updated lint has found some duplicate features in the current set of rules, and for that reason my commits are unable to pass the |
Yes, that would be great 👍🏼 |
e.g. duplicate API features, such as those removed in mandiant/capa-rules#916