vmray: loosen file checks to enable processing of additional file types #2571

Merged
merged 9 commits into from
Jan 23, 2025

mike-hunhoff
Collaborator

@mike-hunhoff mike-hunhoff commented Jan 22, 2025

closes #2403

This PR addresses two issues:

  1. Loosen file checks to enable processing of additional file types, e.g. PS1 and MSI. We do this by largely skipping file and global feature extraction, since the dynamic trace analysis is what matters most here. The trade-off is that capa's results may be incomplete, but they are still useful.
  2. Handle VMRay analysis archives that may have more than one file marked as the submission, e.g. compound ZIP files. Unfortunately, I have not found a reliable way to tell which of these files is the actual submission. Instead, we rely on the ordering of the files and use the last file marked as the submission. This appears to hold true for compound ZIP files.
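
The ordering heuristic in item 2 can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `FileRecord`, `is_submission`, and `pick_submission` are hypothetical names, and the records are assumed to arrive in the same order as they appear in the analysis archive.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FileRecord:
    # hypothetical stand-in for one file entry in an analysis archive
    name: str
    is_submission: bool


def pick_submission(files: Sequence[FileRecord]) -> Optional[FileRecord]:
    """Return the last file flagged as the submission, or None if there is none."""
    chosen = None
    for record in files:  # assumes `files` preserves the archive's ordering
        if record.is_submission:
            chosen = record  # a later match overrides any earlier one
    return chosen


# hypothetical compound ZIP: the container and its payload are both flagged
files = [
    FileRecord("outer.zip", True),
    FileRecord("readme.txt", False),
    FileRecord("payload.ps1", True),
]
print(pick_submission(files).name)  # prints "payload.ps1"
```

Returning `None` instead of raising leaves it to the caller to decide how to handle an archive with no flagged file.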

@mike-hunhoff mike-hunhoff marked this pull request as draft January 22, 2025 18:31

@github-actions github-actions bot left a comment

Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed

@github-actions github-actions bot dismissed their stale review January 22, 2025 18:43

CHANGELOG updated or no update needed, thanks! 😄

@mike-hunhoff mike-hunhoff marked this pull request as ready for review January 22, 2025 20:32
@mike-hunhoff mike-hunhoff requested a review from a team January 22, 2025 20:32
Collaborator

@williballenthin williballenthin left a comment

i think we should add a test case for this setup, because we don't want future code to accidentally start relying on things that might not be present. is this reasonably possible @mike-hunhoff ?

CHANGELOG.md
capa/features/extractors/vmray/extractor.py (outdated)
capa/features/extractors/vmray/extractor.py
@mike-hunhoff
Collaborator Author

mike-hunhoff commented Jan 23, 2025

> i think we should add a test case for this setup, because we don't want future code to accidentally start relying on things that might not be present. is this reasonably possible @mike-hunhoff ?

I've added a test in 438d911 for a minimized PowerShell script trace. Because this is a PowerShell trace, there is no static data, so any future code that assumes static data is present should cause this test to fail.
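
For illustration only, the idea behind this kind of regression test might look like the sketch below. It is not the test added in 438d911: `TraceOnlySample`, `extract_file_features`, and `extract_call_features` are hypothetical names that stand in for a trace with no static data and for extraction code that has to tolerate that.

```python
from typing import Iterator, Optional


class TraceOnlySample:
    """Hypothetical stand-in for a script trace that carries no static file data."""

    static_data: Optional[dict] = None
    api_calls = ["CreateFileW", "WriteFile"]


def extract_file_features(sample) -> Iterator[str]:
    # tolerate missing static data: yield nothing rather than raising
    if sample.static_data is None:
        return
    for name in sample.static_data.get("imports", []):
        yield f"import:{name}"


def extract_call_features(sample) -> Iterator[str]:
    # dynamic features come from the trace alone
    for name in sample.api_calls:
        yield f"api:{name}"


def test_trace_without_static_data():
    sample = TraceOnlySample()
    # code that dereferenced static_data unconditionally would raise here
    assert list(extract_file_features(sample)) == []
    assert list(extract_call_features(sample)) == ["api:CreateFileW", "api:WriteFile"]


test_trace_without_static_data()
```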

@williballenthin
Collaborator

send it!

@mike-hunhoff mike-hunhoff merged commit 160ce73 into master Jan 23, 2025
27 checks passed
@mike-hunhoff mike-hunhoff deleted the fix/2403 branch January 23, 2025 19:47