
[Feature Request] Calculate a checksum for each whitelisted executable #413

wirespecter opened this issue May 13, 2021 · 45 comments

@wirespecter

wirespecter commented May 13, 2021

Hi there,

First of all, let me say a big thank you for this awesome program! It seems to be working well and I love it!

I think it would be greatly improved if it calculated a checksum for each file in the whitelist.
Why? Because a malicious application can replace a whitelisted program with itself to avoid detection.

For example: let's say /usr/bin/ping is whitelisted. If malware overwrites ping, the malicious ping residing at the same path is whitelisted as well.

All of this can be avoided if we compare the malicious ping's checksum with the checksum of the "good" ping that we got when it was whitelisted.

I know this may complicate things when an application gets updated (and its checksum changes), but we could add it as an option in the menu for users who want an additional level of security.
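The proposed check could be sketched like this (a minimal Python sketch for illustration; the rule layout is hypothetical, not OpenSnitch's actual rule format):

```python
import hashlib

def file_checksum(path: str, algo: str = "sha256") -> str:
    """Hash a file in chunks so large binaries aren't loaded into memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_rule(rule: dict, path: str) -> bool:
    """Return True if the binary still matches the checksum stored
    when the rule was created (hypothetical rule layout)."""
    return file_checksum(path, rule["algo"]) == rule["checksum"]
```

If the binary is replaced after the rule was created, `verify_rule` returns False and the user could be re-prompted.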

Thanks for reading so far and please let me know your thoughts below :)

@gustavo-iniguez-goya
Collaborator

hey @wirespecter ,

This is an interesting feature request, and I agree with you on all points.

But it could cause some confusion if, as you said, you've whitelisted a binary and it later gets updated. So an option to enable it explicitly (opt-in) should be mandatory.

On the other hand, when you allow a new process, you don't know if it's goodware or malware.

In any case, at the very least, if the checksum changes we could warn users about it.

@wirespecter
Author

Thanks @gustavo-iniguez-goya ,

On the other hand, when you allow a new process, you don't know if it's goodware or malware.

As with any other process that we whitelist, it is up to the user to decide.

In any case, at the very least, if the checksum changes we could warn users about it.

I agree with this. However, it depends on the default behavior that is going to be implemented after the warning has taken place.

  1. If we are talking about a warning dialog that lets users choose if they still want to allow the executable to communicate with the outside world or not then I'm up for it.

  2. If it is something passive that just "warns" about a checksum change with no options or actions to be made, that may not be the best solution in my humble opinion.

@dmknght

dmknght commented May 17, 2021

I don't really agree with this idea. Yes, it can add more protection for users, but in my opinion it has some disadvantages:

  1. For now OpenSnitch is just a firewall, so it only works on processes' connections. Binary verification is more of an antivirus thing; it belongs to a different layer.
  2. What if the modification is not malicious? For example, I customize ping and compile it for better performance. The checksum check creates a false positive.
  3. A new method can create new bypass paths, and if developers and maintainers want to extend the feature they have to take on more work -> higher development cost.
  4. When a binary is infected, it can create both normal and malicious connections. If we want to detect the malicious connections, opensnitch becomes something like an IDS. An alternative solution is a blacklist of addresses.

In my opinion, an alternative solution could be a whitelist of protocols each binary can use, e.g. ping can't send HTTP requests. I haven't played with OpenSnitch rules much, so I don't know if it is doable.

But I see an interesting point here. Let's say ping is our target. If the user runs ping, it is called from a terminal, so ping is whitelisted. But would ping started by a random process / the system be treated differently from the ping the user starts from a terminal? Since OpenSnitch can get process information, I guess this could be used? I think it's not a perfect solution, but it could be done?

P/s: I think this is a really interesting topic for increasing protection, so I would like to discuss more and learn from everybody ^^ Thank you @wirespecter for a very interesting topic :D

@NRGLine4Sec
Contributor

To my mind, @dmknght is right. The File Integrity Monitoring functionality is more of a HIDS function than a firewall function. There are already tools that do FIM, like Samhain or Wazuh.
Let's see what others think about it.

@gustavo-iniguez-goya
Collaborator

gustavo-iniguez-goya commented May 18, 2021

Before dumping some thoughts: how often will this scenario occur? I see 3 possible scenarios:

  • The system is compromised, the attacker has root access and backdoors several system binaries.
  • There's a binary running as user RMS, an attacker gains access to this account and modifies the binary.
  • The attacker copies an allowed binary to /tmp or /var/tmp.

In the latter case, the admin should allow binaries by specifying the absolute path to the binary.

it can add more protection for users

More than adding protection, I see it as a way of letting the user know that something has changed since the binary was allowed (or denied). You can't configure it right now, but I think of the checksum as just another property of a rule that isn't configurable yet.

If you have allowed nc + dst.port 80|443 + dst.ip 1.1.1.1 and something/someone launches it to connect to malicious.com:443 opensnitch will prompt you to allow or deny it, because the properties of the connection don't match the rule you defined.

However we're implicitly allowing nc, trusting that the binary is goodware in both cases, regardless if it's the same binary or not.

What if the modification is not malicious? For example, I customize ping and compile it for better performance.

In that case you'd allow your ping binary version, with a fixed checksum. You'd allow it only once.
If it happens that you compile it regularly there could be 2 options: a global option to verify checksums, and a rule field to verify the checksum of a particular binary ([x] Verify binary checksum)

Binary verification is more of an antivirus thing; it belongs to a different layer.

We wouldn't be acting as an AV here, because we wouldn't be classifying the binary as goodware, grayware or malware, nor killing it or moving it to quarantine.

But I agree that it adds complexity and probably performance penalty.

But would ping started by a random process / the system be treated differently from the ping the user starts from a terminal? Since OpenSnitch can get process information, I guess this could be used? I think it's not a perfect solution, but it could be done?

In this case, if you have allowed ping, it'll be allowed regardless of whether it's launched from a terminal or spawned by a different process.

The File Integrity Monitoring functionality is more of a HIDS function than a firewall function.

I agree. However, opensnitch isn't a traditional firewall where you only allow/deny network connections. There's an extra component that is also part of the connection: the binary that initiated it.

Binaries also have their own properties: path, name, arguments, current working directory (from where it was launched), permissions, ownership, checksum, size, parent pid, etc.
We're already able to filter by some of these properties. For example:

  • we can filter by parts of the binary path (directory, name, etc)
  • we can filter by arguments of the binary.
  • binary path, arguments and name are case-insensitive by default (configurable)

Filtering by checksum is not something urgent right now, as there are many other things to do first. But it doesn't seem that crazy to me.

Anyway, it'd probably be better to implement it as an addon rather than adding it to the codebase.

@dmknght

dmknght commented May 19, 2021

I'm having an idea: can we create some work modes for OpenSnitch?
For example, Comodo firewall has a safe mode, a game mode, ... something like that. So I'm imagining something like:

  • Safe mode: only accept rules for the current login.
  • Normal mode: like the current behavior.

It is not really practical yet, but if the GUI becomes simpler it may work; it needs more design to be effective.
Kaspersky IS has an app firewall with trusted apps, untrusted apps, low restricted, high restricted... so with something like that we could have a different method for each process. Trusted apps would be like whitelisted applications, while the other trust levels would have different behaviors for asking users about connections. This is a very rough idea, so I'm just putting it here to discuss. With this approach we don't go too deep into binary verification, but rather into actions taken when a binary is executed.

@elesiuta

elesiuta commented Aug 30, 2021

I wrote a small program inspired by this one a while back with this exact feature as well as the option to upload the checksums to VirusTotal.

I recently fixed it up a bit, but the memory usage still isn't great and I don't plan on giving it much attention in the foreseeable future. I figured @wirespecter and others who see this issue may be interested.

@sl805

sl805 commented Nov 3, 2021

@gustavo-iniguez-goya I can't tell you how thankful I am for this piece of software; thanks for all your work. Since this is networking focused, we can adjust this feature accordingly. VirusTotal allows domain validation, so if the user supplies a VT API key, can we add an option to validate the domain an application tries to access (via the VirusTotal API) and show the validation status (Bad\Good\Unknown) in the Allow\Deny connection dialog? This way we keep the focus on networking, yet improve security.

Best regards.

@gustavo-iniguez-goya
Collaborator

Hi @sl805 , thank you for the kind words :)

I think that making requests to an external URL is something many users would be against. I created a PoC of your idea some time ago and posted a screenshot somewhere. It's really easy to integrate VT, at least in the GUI, but we need consensus on these new features.

@cypherbits

I'm against using an external service for AV checking. This is an application firewall not an Antivirus...

But, as an application firewall, we allow "executables" to connect to the Internet. If we cannot ensure our executable is the one we whitelisted, and any malware can replace an executable and phone home, then we are failing our firewall mission, because that malicious executable is not whitelisted. That executable is not actually our executable.

We just actually whitelist paths, not executables.

That is why we should implement optional hashing. I proposed Blake3 because it should be the fastest while remaining secure.

@NRGLine4Sec
Contributor

I'm against too.

@NRGLine4Sec
Contributor

NRGLine4Sec commented Nov 12, 2021

I think we have to be careful with hashing executables, because it can be very annoying to be asked about new connections after an update of an executable that we have already configured/allowed.

@cypherbits

I think we have to be careful with hashing executables, because it can be very annoying to be asked about new connections after an update of an executable that we have already configured/allowed.

That's why I say "optional". It could be a filtering type of a rule: "by executable path" (like now) or "by executable hash".

@dmknght

dmknght commented Nov 12, 2021

There could be a "flexible" solution. On Debian-based systems, binaries installed by apt have checksums under /var/lib/dpkg/info. For example, /usr/bin/apt's checksum lives in /var/lib/dpkg/info/apt.md5sums; grepping that file for usr/bin/apt returns c78db41ce6a57d72bea6c7c4a9e52f25 usr/bin/apt. So at the very least, OpenSnitch could have a "system checksum" option to compare checksums against this hash database.
Pros:

  • The user doesn't have to configure and reconfigure checksums for all files. We could just add one option in settings: Use system's checksums for executables
  • New versions of binaries get new checksums by default. No need to keep a big database and update it every time.

Cons:

  • It is a text file. Malware running as root can edit it.
  • I didn't check other families like Red Hat or Gentoo.
  • There are no checksums for custom packages / custom binaries. Overall, it only covers binaries from packages in the repositories.
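Parsing dpkg's per-package .md5sums files into a lookup table is straightforward (a hedged Python sketch; the '<md5>  <relative path>' line format is the one shown above, and `load_dpkg_md5sums` is a hypothetical helper, not OpenSnitch code):

```python
import glob
import os

def load_dpkg_md5sums(info_dir: str = "/var/lib/dpkg/info") -> dict:
    """Build an {absolute_path: md5} map from dpkg's per-package
    .md5sums files (each line: '<md5>  <path relative to />')."""
    db = {}
    for sums_file in glob.glob(os.path.join(info_dir, "*.md5sums")):
        with open(sums_file) as f:
            for line in f:
                parts = line.split(None, 1)
                if len(parts) == 2:
                    md5, rel_path = parts
                    db["/" + rel_path.strip()] = md5
    return db
```

A binary's on-disk checksum could then be compared against `db.get("/usr/bin/apt")`, keeping in mind the cons listed above (the db is editable by root and doesn't cover custom binaries).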

p/s:

I think we have to be careful with hashing executables, because it can be very annoying to be asked about new connections after an update of an executable that we have already configured/allowed.

That was my point as well, but at least with this solution we get a flexible approach for binaries installed by the system.

@cypherbits

I think checking the hash is more useful for user-owned executables, since system binaries (apt-installed) need root to install, so malware would need to escalate privileges first.

@dmknght

dmknght commented Nov 13, 2021

Well, I think malware with root permission / a rootkit is another scenario, and it's more an AV / HIPS problem than a firewall one.
For regular binary files, we can have something like:

  • Warn the user that the binary is untrusted (it can't be found in the system's checksum database nor in opensnitch's database).
  • For binaries installed by the system, compare the checksum with the system's database, taking apt as the example. But the next problem is how to find the checksum db of the current binary: dpkg -S searches for the package containing the binary path, so we can craft the full path to the checksum file. But spawning that process takes time, and parsing the md5 from the db and calculating the checksum of the current file also takes time and system resources. We don't want to slow everything down.

@sl805

sl805 commented Nov 15, 2021

Hi @sl805 , thank you for the kind words :)

I think that making requests to an external URL is something many users would be against. I created a PoC of your idea some time ago and posted a screenshot somewhere. It's really easy to integrate VT, at least in the GUI, but we need consensus on these new features.

@gustavo-iniguez-goya
Well, then it's even easier to implement. Instead of implementing VT integration, or any kind of reputation check, we can implement it as an admission hook! It'll work upon a connection attempt, but instead of making requests to an external service:

Daemon will:

  • invoke a user-defined script\binary, passing it parameters in a certain format (or any other format):

{"pid": 1111, "path": "/tmp/something", "src_addr": "...", "src_port": "35000", "dst_addr": "example.someting.com", "dst_port": "12345"}

and expect it to return one of the following statuses:
-1 - unknown hook execution result: hook execution failed, the evaluation result is unknown due to some error
0 - positive hook execution result: hook execution completed, the evaluation result is positive\good
1 - negative hook execution result: hook execution completed, the evaluation result is negative\bad

UI will:

  • Wait for the hook execution to complete for default_interval seconds
  • Show the hook execution status: green for OK, red for FAILURE, if a positive\negative result was received. If no result was received within default_interval, or the result is unknown, display yellow\gray
  • If the user has not decided what to do within default_interval seconds, perform the default action together with the hook execution status
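The daemon side of this proposal could be sketched roughly like this (Python used for brevity; the JSON fields and exit-code mapping follow the proposal above, while `run_admission_hook` and its signature are hypothetical):

```python
import json
import subprocess

def run_admission_hook(hook_path: str, conn: dict, timeout: float = 5.0) -> int:
    """Invoke a user-defined hook, passing connection metadata as JSON
    on stdin, and map the outcome to the proposed statuses:
    0 = positive/good, 1 = negative/bad, -1 = unknown/failed."""
    try:
        proc = subprocess.run(
            [hook_path],
            input=json.dumps(conn).encode(),
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return -1  # hook failed or timed out: evaluation result unknown
    # Any exit code other than the two defined statuses is "unknown".
    return proc.returncode if proc.returncode in (0, 1) else -1
```

The GUI would then colorize the prompt based on the returned status, falling back to the default action when it gets -1 or no answer within default_interval.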

In such case this feature becomes optional and will give people freedom to trigger any kind of automation they like:

  • pushing notifications
  • populating stop\allow user-defined lists
  • performing reputation checks
  • triggering process termination

This way, the user-defined code is responsible for whatever the user wants, and OpenSnitch just invokes it and displays the status for the user to be aware of before making the final decision.

Best regards, sl805.

@sl805

sl805 commented Jan 11, 2022

@gustavo-iniguez-goya Happy New Year, Gustavo. Does my idea make any sense?

@gustavo-iniguez-goya
Collaborator

gustavo-iniguez-goya commented Jan 17, 2022

hey @sl805 ,

I think it's quite an elaborate proposal; thank you for taking the time to think about it. I promise I'm taking all suggestions and ideas into account. Reading your proposal, the first thing that comes to my mind is adding it as a plugin 📖

As I said above, for me the checksum is just another property of a binary, so I think I'll end up adding it. Later we can decide what to do with it. I like the idea of starting small and simple and adding complexity over time, so we could start with something like this:

  • Visible from the pop-up.
  • Configurable as a field of a rule, as proposed by @cypherbits .
  • Disabled by default, not to annoy some users as @NRGLine4Sec said.
  • With a tooltip: "If the checksum changes, prompt me to allow or deny the connection"

If the checksum changes, opensnitch will prompt you to allow or deny the connection (as it already does when a field doesn't match).

Regarding @dmknght's idea of using apt checksums, it could be added as a rule of type "lists", like what we already do with domains, IPs, etc. The rule would load the checksums at startup, and the user could decide what to do with them: deny or allow connections that match an application's checksum.
But as s/he noted, it's something very specific to Debian-based distros, so I think it's a perfect candidate for a plugin.

Adding this feature opens the possibility to create lists of IOCs, that could be used to block known malware/badware (for servers mostly), like we already do with domains, IPs, etc.

On the other hand, the user should be able to choose which checksum algorithm to use (md5, sha1, blake3, etc.), mainly because IOCs are created with the md5/sha1 algorithms. So, for example, you could verify firefox's checksum with blake3, and deny everything that matches md5/sha1 checksums from a list.
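An IOC-style check as described could look roughly like this (a Python sketch; the list format, one md5 per line with '#' comments, is an assumption, as are the helper names):

```python
import hashlib

def load_ioc_hashes(list_path: str) -> set:
    """Load a plain-text list of known-bad md5 hashes, one per line
    (hypothetical format; lines starting with '#' are comments)."""
    with open(list_path) as f:
        return {line.strip().lower() for line in f
                if line.strip() and not line.startswith("#")}

def is_known_bad(binary_path: str, iocs: set) -> bool:
    """Hash the binary with md5 (the algorithm most IOC feeds use)
    and check it against the loaded list."""
    h = hashlib.md5()
    with open(binary_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() in iocs
```

A deny rule could then fire whenever `is_known_bad` matches the binary behind an outbound connection, while a separate, stronger algorithm verifies trusted binaries.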

@NRGLine4Sec
Contributor

Hi @gustavo-iniguez-goya,
There are some good suggestions in your post.
I particularly like this one:

Adding this feature opens the possibility to create lists of IOCs, that could be used to block known malware/badware (for servers mostly), like we already do with domains, IPs, etc.

To clarify my point of view a little: I am not against the idea of checking that the binary we want to block, and more particularly to allow, is the one we expect.
I am against this new functionality automatically asking (by default) to authorize or refuse connections for a rule we have already created but whose binary has changed, following an update for example.
But I would really like the feature to "mark" rules whose binaries have changed since the rule was created, with a color (orange for example) on the rule in the OpenSnitch rules interface. It would then be necessary to have the possibility (probably via a right click on the rule) to confirm that the binary change was indeed voluntary (and thus clear the marking).
This would avoid the inconvenience of a new rule-creation request when the binary is different, while still being alerted that it has changed.
It would also mean no creation or change of rules, just a simple temporary marking (until we have a little time to investigate the rules concerned).

@gustavo-iniguez-goya
Collaborator

I am against this new functionality automatically asking (by default) to authorize or refuse connections for a rule we have already created but whose binary has changed, following an update for example.

👍 understood.

But I would really like the feature to "mark" rules whose binaries have changed since the rule was created

That's not possible right now, because we don't have an Alerts module. After you create a rule, it remains static for the rest of its lifetime, and we can only apply 3 actions to it (allow, deny, ask). If a connection doesn't match a rule, right now the only option is to ask the user to allow or deny it.

The alerts module could work just like the application rules: a list of rules with fields to check and actions to perform (send a desktop notification, send an email, colorize a row, etc.). I've thought about this need, because when a list of domains/IPs is reloaded after being updated, you don't get any feedback. It'd be cool to configure an alert that displays a desktop notification saying: "Ads domains list updated".

@dmknght

dmknght commented Jan 19, 2022

Regarding @dmknght's idea of using apt checksums, it could be added as a rule of type "lists", like what we already do with domains, IPs, etc. The rule would load the checksums at startup, and the user could decide what to do with them: deny or allow connections that match an application's checksum.
But as s/he noted, it's something very specific to Debian-based distros, so I think it's a perfect candidate for a plugin.

Hello! I checked dpkg's checksum db, and it doesn't have checksums for all binaries. At a quick look, I can see it ignores the same checksum for different file paths (like /usr/bin/bash and /bin/bash), but it is too soon to say how well it works.

@gustavo-iniguez-goya
Collaborator

As far as I can tell, in that particular case on Sid bash is installed under /bin (dpkg -L bash): /bin/bash
But there'll be quirks for sure. Post any other particularity you see 👍

@dmknght

dmknght commented Jan 19, 2022

As far as I can tell, in that particular case on Sid bash is installed under /bin (dpkg -L bash): /bin/bash. But there'll be quirks for sure. Post any other particularity you see 👍

Well, bash is in both /bin/bash and /usr/bin/bash. The file in /usr/bin/ is not a symlink.

dpkg -S showed only /bin/bash, not /usr/bin/bash.
I guess dpkg-query uses the list of files from .md5sums. As I remember, there are some other cases where a binary and its checksum are not in the .md5sums file; I have to check again to make sure. Overall, the checksum db of dpkg is a promising method, but it is not good enough.

@dmknght

dmknght commented Feb 9, 2022

Hello! I played with the code for checksums from the Debian db. Here is what I found:

  • Fedora doesn't have anything similar. I didn't test other platforms; possibly there's no checksum support, or I couldn't find it.
  • For Parrot home edition, the db is big, with 235k objects (edit: I commented a wrong number before). For specific distros the number could be bigger (up to 300k for Parrot Security?) or smaller (Debian only, maybe around 100k objects).
  • Performance is fast for Nim code. Comparison is fast, but not fast enough IMO. It takes 57 MB of RAM for object storage, and the whole code takes 0.16 sec to parse the db and compare binary info with the database. (Update: measured with the time command to show time and memory used. Update 2: I set a specific name and an invalid checksum for the compared object, to get the worst-case comparison result.)
    Runtime differs with different build flags in Nim; -d:danger gives the fastest result. So performance is questionable for golang.
  • To solve the problem I mentioned before, I loop to compare both the checksum and the absolute path.
  • The checksum db wasn't made for this specific task, so it's not ideal. It contains unnecessary checksums (changelogs, manpages, ...).
  • Filtering paths to keep only executable files' checksums isn't ideal either. We don't know if a package violates the standard and puts a script / binary file in a weird path (say /usr/share/docs/) -> filtering paths is not a good idea, but it did remove 20k objects for me.
  • This method only checks the executable file itself. So there are 2 scenarios that can bypass it easily:
  1. python <script.py> always checks python, or the interpreter in general. script.py, whether from a deb package or an untrusted source, will never be checked unless we develop very complex code to parse and check it.
  2. Similar to 1: let's say the program ping calls a function from static-lib.so. If static-lib.so was modified and injected with malware, we can't detect it unless we do a full analysis and scan.

@dmknght

dmknght commented Feb 9, 2022

Update: to make the program simpler, I tried checking only the md5sum instead of the md5sum plus file path. The code is slightly faster.
However, again, we don't have a similar solution for other distro families.

@NRGLine4Sec
Contributor

I think it would be a lot simpler not to have to checksum every binary the system could have.
Like you said, it will be difficult to find an equivalent to the Debian db for other distros, and this method will not work for applications installed outside of the apt repository, like AppImages.
To my mind, we should only checksum the binary when we create the rule (first connection attempt), keep it in the opensnitch db with the fastest hash algorithm, and check the checksum each time the binary makes outside connections.

@BobSquarePants

I agree with @NRGLine4Sec

and I will point to: https://en.wikipedia.org/wiki/Unix_philosophy

Make each program do one thing well.

@gustavo-iniguez-goya
Collaborator

gustavo-iniguez-goya commented Feb 9, 2022

@dmknght you raised some interesting points 👍

python <script.py> always checks python, or the interpreter in general. script.py, whether from a deb package or an untrusted source, will never be checked unless we develop very complex code to parse and check it.

This will be a problem if we add checksums. I didn't take it into account. This is also true for java, ruby, perl, bash, etc.

Similar to 1: let's say the program ping calls a function from static-lib.so. If static-lib.so was modified and injected with malware, we can't detect it unless we do a full analysis and scan.

The attack vector is interesting. I just tried it and you're right: even with a rule for a particular command (e.g.: telnet www.google.com 80), one could place a backdoor that used the allowed command (telnet ww.go... etc) and initiate a new connection to another host that would also be allowed (LD_PRELOAD=./backdoor.so telnet www.google.com 80, where backdoor.so opens a connection to attacker.com:9999)

However, if you filter by "command line" + host (and port, uid, etc.), then we prompt the user to allow/deny the connection to the attacker's domain/IP, because the properties of the connection don't match any rule (verified ✔️).

Sometimes filtering by application + dest host is not practical, like with web browsers, but checking the checksum wouldn't help here either.

@dmknght

dmknght commented Feb 10, 2022

I think it would be a lot simpler not to have to checksum every binary the system could have. Like you said, it will be difficult to find an equivalent to the Debian db for other distros, and this method will not work for applications installed outside of the apt repository, like AppImages. To my mind, we should only checksum the binary when we create the rule (first connection attempt), keep it in the opensnitch db with the fastest hash algorithm, and check the checksum each time the binary makes outside connections.

Well, I'm personally not a fan of the idea of "OpenSnitch becomes a HIPS". I prefer an application firewall only. And as I pointed out, whether using the system's checksums or custom checksums, there are methods that can bypass this check.

However, if you filter by "command line" + host (and port, uid, etc.), then we prompt the user to allow/deny the connection to the attacker's domain/IP, because the properties of the connection don't match any rule (verified ✔️).

If you mean cmdline parsing, then I think we'd have to write an argument parser for all command lines, expect it to be bug-free for many different inputs, and then analyze / parse all arguments. From my point of view right now, that takes a lot of time and research to make sure it works well. And that doesn't count attackers / malware using simple obfuscation to bypass the check:
echo <base64 payload> | base64 -d | bash -c
So technically it works, but it's very easy to bypass, and it takes a lot of time and resources to maintain. And that doesn't count different shells (fish, zsh, ...) having different syntax.

@gustavo-iniguez-goya
Collaborator

If you meant the cmdline parsing

No, sorry. I meant that if you have a rule that allows traffic initiated by telnet (for example), and someone backdoors telnet with a static-lib.so as you said, to initiate a connection to a malware domain, one way of avoiding that situation is by filtering by process_path + process arguments + destination host + destination port, i.e.: restricting which IPs/domains/ports an application can connect to.

Similar to 1: let's say the program ping calls a function from static-lib.so. If static-lib.so was modified and injected with malware, we can't detect it unless we do a full analysis and scan.

For example, instead of ping let's use telnet (or gnome-software, synaptic, whatever):

  • allow /usr/bin/telnet (any port, any domain, any IP)
    • someone preloads static-lib.so to hide a connection to www.malware.com:80 -> allowed ✔️
  • allow /usr/bin/telnet to port 80 OR 443
    • someone preloads static-lib.so to hide a connection to www.malware.com:80 -> allowed ✔️
  • allow /usr/bin/telnet to host www.domain.com
    • someone preloads static-lib.so to hide a connection to www.malware.com:80 -> denied ❌ -> ask to allow/deny it (app displayed on the pop-up: /usr/bin/telnet)
  • allow /usr/bin/telnet to port 80 AND www.domain.com
    • someone preloads static-lib.so to hide a connection to www.malware.com:44444 -> denied ❌ -> ask to allow/deny it (app displayed on the pop-up: /usr/bin/telnet)
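The four cases above boil down to a simple matching principle: a connection is allowed only when every field a rule constrains matches, and unmatched connections fall back to asking the user. A toy Python sketch (the rule/connection layout is hypothetical, not OpenSnitch's internal format):

```python
def matches(rule_fields: dict, conn: dict) -> bool:
    """A connection matches when every field the rule constrains
    (path, host, port, ...) equals the connection's value;
    unconstrained fields match anything."""
    return all(conn.get(k) == v for k, v in rule_fields.items())

def decide(rules: list, conn: dict) -> str:
    """Return the first matching rule's action; fall back to 'ask'
    (prompt the user), as happens when nothing matches."""
    for rule in rules:
        if matches(rule["match"], conn):
            return rule["action"]
    return "ask"
```

With a rule allowing /usr/bin/telnet only to www.domain.com, a preloaded backdoor's connection to www.malware.com falls through to "ask", which is exactly the third and fourth cases above.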

@gustavo-iniguez-goya
Collaborator

gustavo-iniguez-goya commented Feb 10, 2022

One way of using dpkg's db would be by generating a single file with all the known/installed checksums (md5 or whatever):

cat /var/lib/dpkg/info/*.md5sums > /tmp/dpkg-db.txt

Then you'd create a new rule: [x] Enable, [x] Action: Allow, [x] List of md5sums: /tmp/dpkg-db.txt

Internally, if such a rule is created and enabled, we would 1) get the md5 of an app that is trying to establish an outbound connection, 2) check the path + md5 against the DB.
It'd work very similarly to the other types of lists we already have.

The generated list could be autoupdated by adding a post-invoke script to apt: DPkg::Post-Invoke{"cat /var/lib/dpkg/info/*.md5sums > /tmp/dpkg-db.txt"; };

On the other hand, a user could also create a DB of checksums with the help of a simple script:
find /bin /sbin /usr/bin /usr/sbin -exec md5sum {} \; > /tmp/custom-db.txt

This db would work on any system regardless of the package manager, could be integrated with apt and probably other package managers, or added as a cron task.
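Loading such a checksum list into memory could look like this (a Python sketch; it parses the '<md5>  <path>' lines that md5sum emits, and `load_checksum_db` is a hypothetical helper name):

```python
def load_checksum_db(db_path: str) -> dict:
    """Parse md5sum-style output ('<md5>  <path>') into a
    {hash: path} map, mirroring how domain lists are kept in memory."""
    db = {}
    with open(db_path) as f:
        for line in f:
            parts = line.split(None, 1)
            if len(parts) == 2:
                db[parts[0]] = parts[1].strip()
    return db
```

Keying by hash makes the per-connection lookup a single dict access; the cost is paid once at startup (and again whenever the cron/apt hook regenerates the file).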

@dmknght

dmknght commented Feb 12, 2022

check the path + md5 against the DB.

I think there's no need to check both path and md5; the md5 alone is enough. For example, cp /usr/bin/nc /tmp/copied-nc-file produces a file with the same md5, so we can apply the nc rules to it.

This db would work on any system, regardless of the package manager, and could also be integrated with apt and probably others package managers, or be added to a cron task.

I think it technically works, but with issues:

  1. Each OS could use a different build platform. That means the binary checksum could differ depending on compiler information/metadata.
  2. Different versions of a package on each distro. Of course! Can't be anything else for a checksum db :D
  3. Custom patches from maintainers. I don't know about other platforms, but on Debian some packages carry small patches to fix specific issues or just change something. That also yields a different checksum for the same package with the same version.

Since you mentioned a .txt file for the checksums, I'm wondering: is that faster than loading everything into memory and comparing there? Saving all the sums in a cache file and then iterating to compare each line would be better for memory usage, but the performance is questionable.

@gustavo-iniguez-goya
Copy link
Collaborator

I think it technically works but with issues:

  1. ...
  2. ...
  3. ...

The "db" (a list of md5 checksums + paths) would be generated on each system with a script, via cron or similar, so the list would be specific to that installation.

Since you mentioned a .txt file for the checksums, I'm wondering: is that faster than loading everything into memory and comparing there? Saving all the sums in a cache file and then iterating to compare each line would be better for memory usage, but the performance is questionable.

Nope. the .txt file would be loaded into memory, just like what we do with domains lists. A map of key: hash, value: path
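As a sketch of that in-memory map idea (function names are hypothetical; the daemon itself is written in Go, this is only an illustration of the lookup structure):

```python
# Minimal sketch: load an md5sum-style list ("<hash>  <path>" per line,
# as produced by `md5sum` or dpkg's *.md5sums files) into a dict, so
# that each checksum lookup is O(1) -- the "map of key: hash, value:
# path" described above. Names are hypothetical.
def load_checksum_db(lines):
    db = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # md5sum output separates the hash and the path with two spaces
        checksum, _, path = line.partition("  ")
        db[checksum] = path
    return db

def is_known(db, checksum):
    return checksum in db

# Hypothetical entry, for illustration only
sample = ["d41d8cd98f00b204e9800998ecf8427e  /usr/bin/true"]
db = load_checksum_db(sample)
print(is_known(db, "d41d8cd98f00b204e9800998ecf8427e"))  # True
```

A rule of type "list of md5sums" would then resolve to one membership test per connection, the same pattern already used for the domain lists.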

@dmknght
Copy link

dmknght commented Feb 16, 2022

The "db" (a list of md5 checksums + paths) would be generated on each system with a script, via cron or similar, so the list would be specific to that installation.

So I'm thinking about "hook scripts". APT can be configured to run a script after the user installs any package. I don't really know much about this, but it would be a good method to reload all the checksums into memory.

@gustavo-iniguez-goya
Copy link
Collaborator

gustavo-iniguez-goya commented Apr 25, 2022

I've got a PoC working. Some benchmarks:

md5: 497.725253ms, /usr/lib/chromium/chromium, hash: fc5b1708gh7a0a848bfc5f477bb9d39b
blake3: 362.768448ms, /usr/lib/chromium/chromium, hash: af1349b9f5f9a1a6a0111dea36dcc9499bcb25c9adc112b7cc9a93cae41f3262

That's a lot of time even for a binary of ~180MB. The bottleneck is not the hash algorithm, but the reading of the file.

[edit] Computing chunks of a file (1kb, 4kb, 1mb...) helps to reduce reading times down to µs.

With this PoC you get alerted if the checksum differs. It can be due to malicious activity, an update of the binary, or a process launched from a different namespace/container with the same path + name.
There should be a graphical warning stating that the checksum changed.

Regarding the concerns about if this functionality is part of a firewall:
We need it to prevent some ways of bypassing the rules (with mount namespace + overlayfs, for example). In the end, what we need is more control over processes launched from containers/namespaces. For example: you may have allowed a process launched as UID 1000, but if it's sandboxed (running inside an unprivileged container), the UID may be different.

Some notes:

  • We'll have to deal with packages updates, and how to try not to annoy the users.
    The only feasible way I see is by making it optional.

    On the other hand, when a binary is updated while it's running, every time we get the checksum it'll be different from the one loaded in memory. The user should accept the new connection, and if it has matched a rule, the corresponding rule should be modified.

  • We should think about what to do with regexp rules. For example, with process.Path: ^(/usr/sbin/ntpd|/usr/bin/xbrlapi|/usr/bin/dirmngr)$ we can't hash the value of process.Path directly.
    Exclude these rules from hash calculation? As a starting point that sounds reasonable.

  • if the binary that is opening connections is containerized, we should hash the binary inside the container, not the outer equivalent (that is, the path we receive from the kernel).
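The last point can be sketched as follows (hypothetical helper, assuming a Linux /proc filesystem; hashing /proc/&lt;pid&gt;/root/&lt;path&gt; reaches the binary as seen from the process's own mount namespace):

```python
import os

def real_binary_path(pid, exec_path):
    """Resolve the path of a binary as seen from the process's own
    mount namespace. For containerized processes the path reported by
    the kernel (e.g. /usr/bin/curl) may not exist on the host, but it
    does exist under /proc/<pid>/root/. Hypothetical helper name."""
    rooted = os.path.join("/proc", str(pid), "root", exec_path.lstrip("/"))
    if os.path.exists(rooted):
        return rooted
    # Fall back to the path as reported by the kernel
    return exec_path
```

The checksum would then be computed over the resolved path, so a process running inside a container is hashed against its own binary rather than against an unrelated host binary that happens to share the same path.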

@wirespecter
Copy link
Author

@gustavo-iniguez-goya Thanks for working on this. I agree with all your points above.

@dmknght
Copy link

dmknght commented Aug 10, 2022

@gustavo-iniguez-goya There's a "not very famous" EDR called Falco. It uses eBPF to catch syscalls. It has so many features that I'm wondering: what if there were an application firewall based on this EDR? I think that with rules or custom plugins, developers could create an opensnitch-like UI, and of course calculating checksums would be possible. What do you think about it?

@gustavo-iniguez-goya
Copy link
Collaborator

hey @dmknght , I think that type of software is actually really interesting: falco, tracee, redcanary, tetragon, ehids, ... (there are a lot). But I think they're more server oriented, so I don't know how easy it would be to add a desktop GUI, or whether it's worth the effort.

Regarding this feature in particular, if someone is willing to test it and give back some feedback, I can publish a branch with the feature. There are some quirks and corner cases, but all in all it works fairly well.

@dmknght
Copy link

dmknght commented Aug 13, 2022

hey @dmknght , I think that type of software is actually really interesting: falco, tracee, redcanary, tetragon, ehids, ... (there are a lot). But I think they're more server oriented, so I don't know how easy it would be to add a desktop GUI, or whether it's worth the effort.

Regarding this feature in particular, if someone is willing to test it and give back some feedback, I can publish a branch with the feature. There are some quirks and corner cases, but all in all it works fairly well.

I'm not sure about the others, but Falco can run standalone on the system. It has a gRPC channel, so in theory anybody can write an applet or something like that to receive the alert logs. What I don't really know is how to "hold" the process and network connection (to make it act as a firewall).

@aitorpazos
Copy link

aitorpazos commented Jul 21, 2023

I agree with the point of delegating this problem (assessing whether the binary is malicious or not) to a different tool, but being in the flow of establishing new connections and having the UI notification mechanism in place puts OpenSnitch in a great position to alert users about this.

What about supporting delegating the assessment of the binary to a separate tool?

graph LR
    newCon(["New connection request"]) --> hashChanged{"Binary hash changed"}
    hashChanged -- Yes --> scanBin[["Call binary scan"]]
    scanBin --> binSuspicious{"Binary is suspicious"}
    binSuspicious -- Yes --> alertUser[["Alert User"]]
    hashChanged -- No --> continue(["Continue"])
    binSuspicious -- No --> continue
    alertUser --> userAccepts{"User accepts risk"}
    userAccepts -- Yes --> continue
    userAccepts -- No --> block(["Block"])

What I'm not clear about is how the setup could be made user-friendly.

@gustavo-iniguez-goya gustavo-iniguez-goya modified the milestones: 1.7.0, 1.6.3 Aug 2, 2023
@gustavo-iniguez-goya
Copy link
Collaborator

gustavo-iniguez-goya commented Aug 4, 2023

Some updates on this issue.

I've got this feature working and I'm testing and improving it. I hope to add it after v1.6.2.
Now that we intercept exec events, the scenario has changed a little bit for the better:

We'll have to deal with packages updates, and how to try not to annoy the users.
The only feasible way I see is by making it optional.

Working on it. Probably process hashing will be enabled only if there's a rule that requires it. Still thinking about it.

On the other hand, when a binary is updated while it's running, every time we get the checksum it'll be different from the one loaded in memory. The user should accept the new connection, and if it has matched a rule, the corresponding rule should be modified.

Now the checksum is not recalculated every time a new outbound connection is about to be established. It's cached for the life of the process.
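That per-process cache can be sketched like this (hypothetical class and names; the real daemon ties its cache to exec/exit events from eBPF or the proc connector, this only illustrates the lifetime semantics):

```python
class ChecksumCache:
    """Cache one checksum per PID for the life of the process, so the
    binary is hashed once instead of on every outbound connection.
    Exit events evict the entry, forcing a re-hash on the next run."""

    def __init__(self, hash_func):
        self._hash = hash_func
        self._by_pid = {}

    def get(self, pid, path):
        if pid not in self._by_pid:
            self._by_pid[pid] = self._hash(path)
        return self._by_pid[pid]

    def on_exit(self, pid):
        self._by_pid.pop(pid, None)

calls = []
def fake_hash(path):
    calls.append(path)
    return "deadbeef"

cache = ChecksumCache(fake_hash)
cache.get(1234, "/usr/bin/curl")
cache.get(1234, "/usr/bin/curl")   # served from cache, no re-hash
print(len(calls))  # 1
cache.on_exit(1234)
cache.get(1234, "/usr/bin/curl")   # process restarted: re-hash
print(len(calls))  # 2
```

This also explains the behaviour described below: a binary updated on disk keeps matching its old (cached) checksum until the process exits and is executed again.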

We should think about what to do with regexp rules. For example, with process.Path: ^(/usr/sbin/ntpd|/usr/bin/xbrlapi|/usr/bin/dirmngr)$ we can't hash the value of process.Path directly.
Exclude these rules from hash calculation? As a starting point that sounds reasonable.

With lists of checksums, like the lists we have for domains or IPs, these rules could work by adding the checksums of each path in the regexp.

if the binary that is opening connections is containerized, we should hash the binary inside the container, not the outer equivalent (that is, the path we receive from the kernel).

Done. And added a workaround for AppImages, which are a little bit special.

For simple usage of a Linux desktop (office suites, image editing, browsing the Internet, reading emails...), it looks like it doesn't penalize the user experience much, but we'll see how it behaves on different machines.

On the other hand, the work on this feature will unlock new features, like displaying the chain of parent processes that executed the process, or filtering by parent PID.
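For illustration, the parent chain mentioned above can be recovered from /proc by following PPid entries (a sketch assuming a Linux /proc filesystem, not the daemon's actual implementation, which reads its cache of exec events):

```python
import os
import re

def parent_chain(pid):
    """Walk the PPid entries in /proc/<pid>/status to build the chain
    of parents of a process, closest parent first. Returns an empty
    list if /proc is unavailable or the process has vanished."""
    chain = []
    while pid > 1:
        try:
            with open(f"/proc/{pid}/status") as f:
                status = f.read()
        except OSError:
            break
        m = re.search(r"^PPid:\s+(\d+)", status, re.M)
        if not m:
            break
        pid = int(m.group(1))
        if pid > 0:
            chain.append(pid)
    return chain

print(parent_chain(os.getpid()))
```

Reading /proc like this is racy for short-lived processes, which is precisely why caching exec events from the kernel is the more reliable source for this hierarchy.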

gustavo-iniguez-goya added a commit that referenced this issue Sep 21, 2023
Now you can create rules to filter processes by checksum. Only md5 is
available at the moment.

There's a global configuration option that you can use to enable or
disable this feature, from the config file or from the Preferences
dialog.

As part of this feature there have been more changes:

   - New proc monitor method (PROCESS CONNECTOR) that listens for
     exec/exit events from the kernel.
     This feature depends on CONFIG_PROC_EVENTS kernel option.

   - Only one cache of active processes for ebpf and proc monitor
     methods.

More info and details: #413.
@gustavo-iniguez-goya
Copy link
Collaborator

gustavo-iniguez-goya commented Sep 22, 2023

Added: 7a9bb17

This commit is only one piece of the puzzle, it's WIP.
TODO list:

  • Display a visible warning to indicate that the checksum changed, and that this is the reason for the new pop-up of an already allowed/denied process.
  • Allow updating a rule directly from the pop-ups.
  • Display the checksum on the Process dialog.
  • Show the path of the process' parents on the pop-up + Process dialog.
  • Allow using lists of hashes (md5, sha1, ...). For example, to use dpkg's checksums from /var/lib/dpkg/info/*.md5sums
  • Add more hashing algorithms.
  • Obtain correctly the checksum of AppImages to match the one on disk.
  • Add CONFIG_PROC_EVENTS to the requirements check.
  • Extract the hashing functionality from the cache of exec events to its own module.

As part of this feature there has been added new functionality and code reorganization:

  • New proc monitor to intercept exec/exit events from the kernel by using the [Kernel Proc Connector](https://lwn.net/Articles/153694/) (see also https://github.com/vishvananda/netlink/blob/main/proc_event_linux.go).
    By default we try to use eBPF for this, but if the monitor method is "proc" or the eBPF module opensnitch-procs.o doesn't work (due to kernel restrictions, liquorix), then we'll fallback to this feature.
    On the one hand, we no longer need to poll active processes every second; we receive the events asynchronously.
    If the monitor method is "proc", we should now be able to intercept short-lived processes easily, which was not possible before.
    On the other hand, under heavy load, such as when creating an initramfs or compiling large source code bases, you may notice an increase in CPU usage.

  • Only one cache of active processes for all the monitor methods, ebpf or proc.

Some caveats:

  • When you allow/deny a connection filtering by checksum, and the process's checksum changes (e.g., due to an update), you'll be prompted to allow/deny it again. There's no visual indication that the checksum has changed, yet.

    • For now, you'll have to copy the new checksum, and update the rule manually...
  • Only the interpreters are hashed (python, perl, bash), not the scripts.

Peculiarities:

  • The first path that's hashed is the link to the binary (/proc/<pid>/exe). After updating a binary that's already running, the new binary on disk differs from the one running in memory, so the checksums do not match. However, the checksum you allowed is valid until you restart the process.
    Hashing the binary path gives us the checksum of the binary on disk, not in memory.
  • The second path is the "real path" of the process, that is, the absolute path of the process taking its root fs into account.
    For example, we receive exec events from the kernel like /usr/bin/curl, but if the binary is executed from a different mount namespace (containerized), the path may not exist. In these scenarios the real path to the binary is /proc/<pid>/root/ + /usr/bin/curl
  • AppImages' directories can't be read even with root privileges, due to being a FUSE mount.
    For these cases, a temporary hash is calculated, which doesn't match the one on disk. It persists across executions.

Notes:

  • Due to certain requirements, now we obtain the parent of a process. As a result, displaying the entire chain of processes that led to the opening of a connection should be straightforward.
  • Filtering by children should now be simpler.

gustavo-iniguez-goya added a commit that referenced this issue Sep 30, 2023
 - Obtain the process's parent hierarchy.
 - Display the hierarchy on the pop-ups and the process dialog.
 - [pop-ups] Added a Detailed view with all the metadata of the
   process.
 - [cache-events] Improved the cache of processes.
 - [ruleseditor] Fixed enabling md5 checksum widget.

Related: #413, #406
@gustavo-iniguez-goya
Copy link
Collaborator

gustavo-iniguez-goya commented Nov 18, 2023

An update on this issue: in general it's working fine. I've only observed a problem with apps that spawn other processes (firejail for example). If you don't know what's going on, the visual warning stating that the checksum has changed can be a bit confusing.

On the other hand, it's a bit annoying having to update a rule manually whenever a checksum changes due to an update. Not much, but I guess if you're on a rolling-release distro it will be a bit more frustrating. I'm working on improving the user experience for that scenario.

And some interesting notes about this issue after watching some Linux Plumbers '23 talks.

There are some technologies that we could investigate to replace or complement the current approach (calculating checksums on demand).

More info:

https://lwn.net/Articles/753276/
https://events19.linuxfoundation.org/wp-content/uploads/2017/12/LSS2018-EU-LinuxIntegrityOverview_Mimi-Zohar.pdf
http://downloads.sf.net/project/linux-ima/linux-ima/Integrity_overview.pdf

gustavo-iniguez-goya added a commit that referenced this issue Nov 24, 2023
When the checksum of a binary changes, due to an update or something
else, you'll be prompted to allow the outbound connection if the
previous checksum of the rule doesn't match the new one.

Without a visual warning it was almost impossible to know what was
going on. Besides, you had to dismiss that pop-up, find the rule, and
update the checksum.

Now there's a warning message, and you can update the rule from
the pop-up.

Related: #413
gustavo-iniguez-goya added a commit that referenced this issue Dec 12, 2023
 - Fixed several leaks.
 - Cache of events reorganized and improved.
   * items are added faster.
   * proc details are rebuilt if needed (checksums, proc tree, etc)
   * proc's tree is reused if we've got the parent in cache.

rel: #413
@Logicwax
Copy link

Hopefully this feature gets into the next release that's pushed to the repos soon. Looking forward to it, especially for user-owned binaries!


10 participants