-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
High entropy string not being detected (by default) #323
Comments
@cburton-godaddy, this is expected behavior for the current implementation. The trick is that tartufo only looks at the entropy of certain strings, namely those that appear to be hexadecimal or base64 encodings. Because your sample string contains many punctuation characters that aren't used by any of those encodings, it is broken up into lots of tiny strings that might be valid base64 or hex encodings, but all of them are so short they are discarded without examination. We've thought in the past about adding support for other less-common encodings (such as the password hashes typically found in Similarly, if you wanted to spot sequences like your example above, instead of entropy I'd consider a regex that matches your alphabet (just about everything except whitespace, it appears) and alarm if it's detected (without considering its entropy). This is somewhat guided by how you might encounter such a string in your code -- would it be delimited (by quotation marks, etc.) or on a line by itself in input data (i.e. delimited by beginning-of-line and end-of-line)? Performing unconstrained entropy detection would be a major change, because we wouldn't be able to spot high-entropy sequences without doing some sort of rolling scan (basically considering every substring of length >= 20 in the entire file). Assuming you aren't just kicking tartufo to see how it screams :) I'd be interested in a use case or example that we could use as a basis for more concrete and considered evaluation of gaps in the present implementation. In the meantime, I consider this report a bug only to the extent that it's a "documentation bug" -- the general behavior hasn't changed since before the truffleHog fork, but I reviewed the description on tartufo.readthedocs.io and I agree the paragraph on entropy scanning is misleadingly vague. |
Hi @rbailey-godaddy, thanks for the quick reply. Your logic makes sense, after I posted this issue I discussed with team and I think we came to the same conclusion that there must be some heuristic you are using to find strings to analyze (the base64/hex encoding), otherwise you would have to do a rolling scan as you said, which I presume would be much more expensive. As a consumer of the package, I would like to see Tartufo handle thsese strings, as they are high entropy and could 100% exist in codebases. However I understand your reasoning that this isn't a simple job and apprecicate the thought and work that has already been put into the package. The reason I came across this as I was demonstrating our pre commit and build integrations and went to go grab a random high entropy string to test with and encountered this problem. I also agree that updating the documentation would be great, as this tool was described to us (interally) as a solution to preventing secrets in a repo, a more in-depth description of how the tool works would have been greatly beneficial. Thanks for your time writing this out! Feel free to close this issue if you think that is appropriate. |
🐛 Bug Report
I was trying strings generated by this online tool and could not understand why it wasn't working. https://passwordsgenerator.net/
For example, this string
qdKy]^$6bc!hU}Z;VRURf=j9F-}6~Xh~ykM.bKxfHdw_z6=Ura2Z".?!U!;;Kxt!Rw(g>"/Fw~e%gF'4)=\[!c>jbB;Kg=8V%\
is not classed as high entropy to TartufoTo Reproduce
git init
git add .
git commit -m "initial commit"
git add .
docker run --rm -v "$(pwd):/git" godaddy/tartufo pre-commit
Expected Behavior
Should report strings like the above as high entropy by default
Code Example
Environment
Using Docker so environment isn't relevant.
The text was updated successfully, but these errors were encountered: