Lower down the number of disk operations #94
Comments
Hi! The question is, though: is it worth it? By minimizing reads from the disk I could increase the speed, but with the current model the speed is still satisfactory, so I don't see it as a high priority. I see more benefit in low memory consumption, keeping in RAM only what is necessary at the time. I can understand that it is an issue when you use PE-sieve in a sandbox, but to be honest, sandboxes are not the target environment for this tool. That is not only because of the problem you described, but also because a sandbox environment can generate many in-memory artifacts that PE-sieve will pick up unnecessarily, producing noise. Also, the sandbox already monitors your API calls, so it can help you pick up the implants in other ways.
Thank you for your kind reply. Please let me clarify a bit. To be more precise, it is about 250k+ disk events in a few seconds. Is that a lot? Maybe not. However, for comparison, a common browser generates about 20k+ operations in the first minute, then far fewer, and browsers are quite heavy today. Some kind of search tool would be comparable to "mal_unpack". We usually run the unpacker for up to 10 minutes, so that comes to about ~100000k disk events. This includes reading, writing, and spending physical resources. Answering your question:
It seems so. Consider that (1) system DLLs are loaded only once and "all programs share the same in-memory copy of code" (link), with a new page allocated only when something changes, and (2) malware usually uses a limited number of DLLs, so you don't need to allocate a lot of memory. So caching DLLs might not lead to high memory consumption, while wasting fewer resources.
This is not really necessary. It depends on the sandbox. Anyway, I highly appreciate your great efforts in developing this project. Thank you.
Thank you for your remarks.
It is not that simple in this case. Each DLL, before it can be compared, has to be loaded manually and preprocessed. The modules that I am scanning cannot be loaded into PE-sieve just by LoadLibrary, for various reasons. And even if it were fine to load them like this, they would still take up space within the process memory, and you would still need to read the file to load them.
PE-sieve scans various processes, and not all of them use a small number of DLLs, so this argument doesn't really hold. I also can't really agree that malware uses a small number of DLLs. Malware usually has a small number of DLLs in its import table, but it can load a lot of DLLs as it runs, and PE-sieve has to scan all of them. Yet, I do understand the problem:
I will try to minimize it, first by refactoring the existing code and then, eventually, by adding caching. I can also agree to cache some of the most often used DLLs, such as:
Honestly speaking, I didn't plan for PE-sieve to be used in sandboxes, but I always try to meet users' needs.
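The trade-off discussed in this comment (fewer disk reads versus bounded memory use) can be illustrated with a small least-recently-used cache of file contents. This is only a sketch in Python, not PE-sieve's actual implementation: the `ModuleCache` class, its `max_entries` limit, and the `disk_reads` counter are all hypothetical names introduced for illustration, and the real tool would also have to preprocess each module rather than keep raw bytes.

```python
from collections import OrderedDict
from pathlib import Path


class ModuleCache:
    """Bounded LRU cache of raw file contents, keyed by path (hypothetical helper)."""

    def __init__(self, max_entries=4):
        self.max_entries = max_entries   # upper bound on cached files, to cap memory use
        self.disk_reads = 0              # number of times the disk was actually read
        self._entries = OrderedDict()    # path -> bytes, oldest entry first

    def get(self, path):
        key = str(path)
        if key in self._entries:
            # Cache hit: no disk access, just mark the entry as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: read the file once and remember it.
        data = Path(key).read_bytes()
        self.disk_reads += 1
        self._entries[key] = data
        if len(self._entries) > self.max_entries:
            # Evict the least recently used entry to keep memory bounded.
            self._entries.popitem(last=False)
        return data
```

With such a cache, scanning many processes that share the same frequently used DLLs would read each cached file from disk once, while rarely used modules are evicted and re-read on demand.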
@AndyWatterman - I implemented the caching, please check it out and share your opinion! It can be enabled in
Have a nice day!
I found out that during process scanning "pe-sieve", as well as "mal_unpack", performs a huge number of disk operations. This is fine when you are inside a VM on a physical machine. However, in a sandbox environment that analyzes every disk operation, it causes a problem.
I did not analyze the code, but I think the problem is the comparison between the mapped image and the original one. Would it be better to have some kind of cache of the most often used libraries? Or at least to map library images to avoid disk operations? Might that be an option?
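For readers unfamiliar with the comparison being guessed at here, the following is a minimal sketch in Python of finding the byte ranges where an in-memory copy differs from the original file contents. It assumes both are flat byte buffers with the same layout, which is a simplification: as noted in the thread, the real scanner has to load and preprocess each module before comparing. The function name `diff_ranges` is hypothetical.

```python
def diff_ranges(original: bytes, in_memory: bytes):
    """Return half-open (start, end) ranges where the two buffers differ."""
    ranges = []
    start = None
    common = min(len(original), len(in_memory))
    for i in range(common):
        if original[i] != in_memory[i]:
            if start is None:
                start = i                  # a new differing run begins here
        elif start is not None:
            ranges.append((start, i))      # the differing run ended at i
            start = None
    if start is not None:
        ranges.append((start, common))
    # Treat any length mismatch as one trailing differing range.
    longest = max(len(original), len(in_memory))
    if longest > common:
        if ranges and ranges[-1][1] == common:
            ranges[-1] = (ranges[-1][0], longest)
        else:
            ranges.append((common, longest))
    return ranges


if __name__ == "__main__":
    on_disk = b"\x55\x8b\xec\x90\x90\xc3"
    patched = b"\x55\x8b\xec\xcc\xcc\xc3"   # two bytes overwritten in memory
    print(diff_ranges(on_disk, patched))
```

The `original` buffer is what has to come from disk for every scanned module, which is where the disk operations described above originate.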