-
-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"take-2": new Windows backend, ctypes-based, separate class which shards via Attributes #555
Conversation
@jaraco any feedback on this rewrite? |
if existing_pw: | ||
# resave the existing password using a compound target | ||
existing_username = existing_pw['UserName'] | ||
target = self._compound_name(existing_username, service) | ||
self._set_password( | ||
target, | ||
existing_username, | ||
existing_pw.value, | ||
) | ||
self._set_password(service, username, str(password)) | ||
|
||
def _set_password(self, target, username, password): | ||
# resave the existing password using a compound target | ||
# Fixes part of https://github.com/jaraco/keyring/issues/545, | ||
# but get_credentials also needs to be fixed to search in the same | ||
# order as get_password. | ||
if existing_username != username: | ||
target = self._compound_name(existing_username, service) | ||
self._set_password( | ||
target, | ||
existing_username, | ||
existing_pw.value, | ||
encoding=encoding, | ||
) | ||
self._set_password(service, username, str(password), encoding=encoding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by this comment. It's saying that by re-saving the existing password using a compound target, it fixes part of 545, yet re-saving the existing password was already part of the prior change. Moreover, "get_credentials also needs to be fixed" feels like a misplaced TODO. Additionally, the comment doesn't say anything about what the code is meant to do, only has a reference to a reported issue.
By reading the issue, I now think the comment is only about the "if existing_username" check. The problem with putting the comment here is it becomes interleaved with the other comment, making it difficult to tell what comment goes with which parts of code. Since this change adds a whole new branch of code, it may be time to consider a refactor.
At the very least, I'd move the "resave" comment back to where it was (at the beginning of the if existing_pw
branch and then move the rest of the comment into the if existing_username
branch and rephrase to indicate its essential purpose, something like, "Only set the compound target when usernames match. Ref #545."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
return credential | ||
|
||
|
||
def CredReadFromAttributes(Type, TargetName, Credential=None, encoding='utf-8'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: I'd probably name this using PEP 8 conventions. The reason for using the capital camel case is to match the names/parameters of the Windows API. In this case, the functionality is not part of the Windows API, but is a helper function, so probably should follow the naming of other helper functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I was writing it, I was trying to imagine the function I wish was already in the API, and wrote the one I wish was there. But that's all inside my head. I agree that it should be named as a helper, not as a faux member of the real API.
Credential['CredentialBlob'] = accum.decode(encoding) | ||
Credential['CredentialBlobSize'] = len(Credential['CredentialBlob']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a little odd to me to be utilizing the API structures to embed the altered values, values that wouldn't be allowed in the structure. I think I'd rather see a different approach where the encoding/decoding strategy sits as a layer between the Windows API and the keyring API, and less intertwined with the Windows API.
I can provide an example later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed (see my previous comment for how I got there.)
def tearDown(self): | ||
# clean up any credentials created | ||
for cred in self.credentials_created: | ||
try: | ||
self.keyring.delete_password(*cred) | ||
except Exception as e: | ||
print(e, file=sys.stderr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what's going on here. This functionality seems to have been removed without justification. This code is there for a reason, preventing the pollution of a user's own keychain when running tests. Why was it removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I ran my tests, I noticed that this function wasn't getting called. Furthermore, the cleanup
function in the underlying BackendBasicTests
was getting called (seemingly accomplishing the same goal, although without the try...except
.
Looking at the pytest
documentation, I see that there is an optional teardown
(lowercase "d"
) method that a test class can provide. My guess is that this function was intended to be named teardown
and was mistakenly named tearDown
, which made it invisible to pytest
.
Since the base class BackendBasicTests
cleanup
seemed to clean up all the unit test password in my run, I don't know that we need this one.
I should have documented why I yanked it -- sorry about that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this is looking really good. I quite like the direction it's gone and it's homing in on a sound solution.
I still worry there's just too much going on. This one change affects several different behaviors (introducing new keyring, switch to local ctypes API, changes existing pw checks, introduce encoding, introduces tests for broken encoding, ...). Although I think the final implementation may look very much like what you have here, it's just too much changing for one commit.
I'd much rather see this change as a series of incremental changes, each addressing more selective concerns, either as separate, independent PRs or as separate commits if very closely related.
For example: PR1 could implement simply switching from pywin32ctypes to a native ctypes implementation. That change could be reviewed, accepted, and released so that its effect could be proven and any unexpected regressions could be addressed before accepting more changes that depend on that change. Then PR2 could implement the sharding strategy (based on PR1). PR3 could fix the existing password checks and be independent of PR1 and PR3.
How would you feel about that approach?
I like the idea of a phased set of pull requests. The only drawback is that I am inexperienced both with git and with tox etc on Linux. So I am getting bogged down in the mechanics. I could really use some instructions as to how to install on Ubuntu so I can test. Also, I don't know how to create new pull requests into my account without clobbering the existing cloned repo. Other than these mechanical issues, I'd be happy to break it all apart into 3 PRs. Maybe you or someone else could undertake PR3 (and fix the typo with [EDIT]
to |
For me, the easiest way to test on Ubuntu is to download Docker. I have a "tox-multipy" image I use to test Python applications. From the keyring source folder:
Oh, that must be really frustrating. I'd recommend to get a nice repo Gui like VSCode or Sourcetree or PyCharm or Sublime Merge (maybe in that order) and use that to manage your repo. It will make it easier to discover operations (like creating a branch and pushing it to your Github repo) and also visualize your changes. I'd also recommend reading through some of the tutorial resources on using Github, which talk about creating branches in Github. For your current code, I'd recommend to create a branch for it.
Then reset your main to match upstream and reset to that state.
Note, you may or may not have a remote called "upstream". If not, you may need to create one first.
I hope that helps. I would recommend spending some time getting familiar with branching and submitting PRs from branches as that will make your life immensely easier when it comes to contributing these changes. Thanks for your patience and perseverance on this effort! |
Closing as stale. Feel free to revive this effort at any time. |
@jaraco, here's my fresh effort to implement an improved Windows backend, to the plan you outlined:
Re 1, I contacted pywin32-ctypes, and heard back that it would take a week or two to hear back from them. I wrote the code that I would like to have from that project. It is in
keyring/backends/Windows/api.py
. I implemented it, this time, using onlyctypes
, and notcffi
.Re 2, I moved the existing
backends/Windows.py
tobackends/Windows/__init__.py
, and ported it to useapi.py
.Re 3, I implemented
class WinVaultAttributesKeyring(WinVaultKeyring)
to isolate the new sharding logic.In order to study use-cases of
Credential Manager
, I implementedCredEnumerate
, which let me more readily dump out all the credentials in myCredential Manager
. I learned that, apparently, Microsoft had the same issue we have, in needing more room for JWTs.Their sharding solution uses Attributes, rather than creating multiple entries. I liked that much better, and so the sharding logic in this PR uses Attributes. I'm including a redacted version of that dump here, which includes the first few bytes of
CredentialBlob
from each entry. I sure hope I redacted enough!!!redacted_credentials2.txt
While the introduction of the use of Attributes would be new to
keyring
, I restrict their use to the "password won't fit and we raise1783
error" case. Passwords that fit are treated the same as in shippingkeyring
, so the risk introduced should be small.I added a test illustrating the problem raised in #554.
def test_read_even_length_utf8_password(self)
I also fixed the extra duplication of entries raised in #545. But I didn't know how to fix
get_credentials
, per the observation you made.What's left?:
I don't understand the
priority
system 100%, and need some guidance there. But I did add some tests explicitly targeting the new sharding backend.I'd like to resolve the
get_credentials
part of thecompound_name
issue. I don't understand the algorithm well enough to fix it myself. In general, the idea that I callset_password
with oneservice
, and, later, the system renames it, yet transparently finds it for me, is pretty confusing to me. I would benefit from more info about what problem is being solved. Is it that thekeyring
model is that only the(service, username)
is meant to be unique?My credentials file shows that only
keyring
users (and presumably otherpywin32
orpywin32-ctypes
users) are usingUTF-16
. Many of the credentials appear to be various 8-bit encodings, and typically do not appear to be Unicode encodings at all. If myCredential Manager
data is indicative, I don't think we are helping interoperability by imposingUTF-16
. Rather, I think we simply inherited this behavior frompywin32
. I'd love to discuss what we can do about this, without disrupting existing users.