"take-2": new Windows backend, ctypes-based, separate class which shards via Attributes #555

jrobbins-LiveData · 2021-12-05T18:40:38Z

@jaraco, here's my fresh effort to implement an improved Windows backend, to the plan you outlined:

1. Determine if pywin32-ctypes can support the necessary interfaces herein and at least capture any shortcomings.
2. Port the existing backend to an API that supports the necessary interfaces (perhaps an updated version of pywin32-ctypes or maybe a bespoke implementation). Cut a release to get early feedback on any regressions related to that change.
3. Implement the ShardedKeyring as a specialized subclass or mix-in, encapsulating the sharding logic in one class.

Re 1, I contacted the pywin32-ctypes maintainers and was told it would take a week or two to hear back from them. I wrote the code that I would like to have from that project. It is in keyring/backends/Windows/api.py. This time, I implemented it using only ctypes, not cffi.

Re 2, I moved the existing backends/Windows.py to backends/Windows/__init__.py, and ported it to use api.py.

Re 3, I implemented class WinVaultAttributesKeyring(WinVaultKeyring) to isolate the new sharding logic.

In order to study use-cases of Credential Manager, I implemented CredEnumerate, which let me more readily dump out all the credentials in my Credential Manager. I learned that, apparently, Microsoft had the same issue we have, in needing more room for JWTs.

Their sharding solution uses Attributes, rather than creating multiple entries. I liked that much better, and so the sharding logic in this PR uses Attributes. I'm including a redacted version of that dump here, which includes the first few bytes of CredentialBlob from each entry. I sure hope I redacted enough!!!
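A minimal sketch of what attribute-based sharding could look like, as an illustration only and not the code in this PR. It splits an encoded password into chunks small enough to fit one attribute value each and joins them back on read. The 256-byte and 64-attribute limits are assumed to match CRED_MAX_VALUE_SIZE and CRED_MAX_ATTRIBUTES in wincred.h, and the helper names shard_blob and join_shards are hypothetical.

```python
# Assumed limits, believed to match wincred.h; verify before relying on them.
CRED_MAX_VALUE_SIZE = 256   # max bytes in one attribute value
CRED_MAX_ATTRIBUTES = 64    # max attributes on one credential


def shard_blob(blob: bytes, size: int = CRED_MAX_VALUE_SIZE) -> list[bytes]:
    """Split blob into consecutive chunks of at most `size` bytes."""
    chunks = [blob[i:i + size] for i in range(0, len(blob), size)]
    if len(chunks) > CRED_MAX_ATTRIBUTES:
        raise ValueError("password too long to shard across attributes")
    return chunks


def join_shards(chunks: list[bytes]) -> bytes:
    """Reassemble the original blob from its ordered chunks."""
    return b"".join(chunks)
```

Keeping the chunks in order matters: the read side has to concatenate attribute values in the same sequence they were written.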

redacted_credentials2.txt

While the introduction of the use of Attributes would be new to keyring, I restrict their use to the "password won't fit and we raise 1783 error" case. Passwords that fit are treated the same as in shipping keyring, so the risk introduced should be small.
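The "only shard when the plain write fails" rule can be sketched as follows. This is a hedged illustration, not the PR's implementation: CredWriteError, set_with_fallback, and the two writer callables are hypothetical stand-ins, and 1783 is assumed to be the Windows code RPC_X_BAD_STUB_DATA ("The stub received bad data").

```python
RPC_X_BAD_STUB_DATA = 1783  # assumed: the error seen when the blob is too large


class CredWriteError(Exception):
    """Hypothetical stand-in for the backend's write failure."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def set_with_fallback(write_plain, write_sharded, password):
    """Try the normal write; shard only if the blob is rejected as too large."""
    try:
        write_plain(password)
        return "plain"
    except CredWriteError as err:
        if err.code != RPC_X_BAD_STUB_DATA:
            raise  # unrelated failures are not masked by the fallback
    write_sharded(password)
    return "sharded"
```

With this shape, passwords that fit never touch the new code path, which is the property the paragraph above relies on to keep the risk small.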

I added a test illustrating the problem raised in #554: test_read_even_length_utf8_password.
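A small illustration of why an even-length UTF-8 password is the interesting case, assuming the reader tries UTF-16 first and falls back to UTF-8 only on a decode error (naive_decode is a hypothetical helper, not the backend's code). An odd number of bytes cannot be valid UTF-16, so the fallback triggers; an even number of bytes often decodes "successfully" into the wrong characters.

```python
def naive_decode(blob: bytes) -> str:
    """Try UTF-16 first and fall back to UTF-8 (assumed detection order)."""
    try:
        return blob.decode("utf-16-le")
    except UnicodeDecodeError:
        return blob.decode("utf-8")


odd = "abc".encode("utf-8")    # 3 bytes: UTF-16 decoding fails, fallback works
even = "abcd".encode("utf-8")  # 4 bytes: UTF-16 decoding succeeds, wrongly

odd_result = naive_decode(odd)
even_result = naive_decode(even)
```

Here odd_result round-trips to the original string while even_result does not, so detection by "did decoding raise?" is not reliable.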

I also fixed the extra duplication of entries raised in #545. But I didn't know how to fix get_credentials, per the observation you made.

What's left?:
I don't understand the priority system 100%, and need some guidance there. But I did add some tests explicitly targeting the new sharding backend.

I'd like to resolve the get_credentials part of the compound_name issue. I don't understand the algorithm well enough to fix it myself. In general, the idea that I call set_password with one service, and, later, the system renames it, yet transparently finds it for me, is pretty confusing to me. I would benefit from more info about what problem is being solved. Is it that the keyring model is that only the (service, username) is meant to be unique?

My credentials file shows that only keyring users (and presumably other pywin32 or pywin32-ctypes users) are using UTF-16. Many of the credentials appear to be various 8-bit encodings, and typically do not appear to be Unicode encodings at all. If my Credential Manager data is indicative, I don't think we are helping interoperability by imposing UTF-16. Rather, I think we simply inherited this behavior from pywin32. I'd love to discuss what we can do about this, without disrupting existing users.

jrobbins-LiveData · 2021-12-22T01:09:37Z

@jaraco any feedback on this rewrite?

jaraco · 2022-01-02T18:09:17Z

keyring/backends/Windows/__init__.py

        if existing_pw:
-            # resave the existing password using a compound target
            existing_username = existing_pw['UserName']
-            target = self._compound_name(existing_username, service)
-            self._set_password(
-                target,
-                existing_username,
-                existing_pw.value,
-            )
-        self._set_password(service, username, str(password))
-
-    def _set_password(self, target, username, password):
+            # resave the existing password using a compound target
+            # Fixes part of https://github.com/jaraco/keyring/issues/545,
+            # but get_credentials also needs to be fixed to search in the same
+            # order as get_password.
+            if existing_username != username:
+                target = self._compound_name(existing_username, service)
+                self._set_password(
+                    target,
+                    existing_username,
+                    existing_pw.value,
+                    encoding=encoding,
+                )
+        self._set_password(service, username, str(password), encoding=encoding)


I'm confused by this comment. It's saying that by re-saving the existing password using a compound target, it fixes part of 545, yet re-saving the existing password was already part of the prior change. Moreover, "get_credentials also needs to be fixed" feels like a misplaced TODO. Additionally, the comment doesn't say anything about what the code is meant to do, only has a reference to a reported issue.

By reading the issue, I now think the comment is only about the "if existing_username" check. The problem with putting the comment here is it becomes interleaved with the other comment, making it difficult to tell what comment goes with which parts of code. Since this change adds a whole new branch of code, it may be time to consider a refactor.

At the very least, I'd move the "resave" comment back to where it was (at the beginning of the if existing_pw branch), then move the rest of the comment into the if existing_username branch and rephrase it to indicate its essential purpose, something like, "Only set the compound target when usernames differ. Ref #545."
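To make the suggested comment placement concrete, here is a hedged sketch using an in-memory dict in place of Credential Manager. SketchKeyring is hypothetical, and the "username@service" form of _compound_name is an assumption about the real backend.

```python
class SketchKeyring:
    """In-memory stand-in; not the real WinVaultKeyring."""

    def __init__(self):
        self.store = {}  # target -> (username, password)

    @staticmethod
    def _compound_name(username, service):
        # Assumed naming scheme for the compound target.
        return f"{username}@{service}"

    def set_password(self, service, username, password):
        existing = self.store.get(service)
        if existing:
            # resave the existing password using a compound target
            existing_username, existing_password = existing
            if existing_username != username:
                # Only move the entry when it belongs to a different user,
                # so re-saving for the same user creates no duplicate.
                # Ref #545.
                target = self._compound_name(existing_username, service)
                self.store[target] = (existing_username, existing_password)
        self.store[service] = (username, str(password))
```

Setting a password twice for the same user leaves one entry; setting it for a second user moves the first user's entry to the compound target.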

jaraco · 2022-01-02T18:13:27Z

keyring/backends/Windows/api.py

+    return credential
+
+
+def CredReadFromAttributes(Type, TargetName, Credential=None, encoding='utf-8'):


nitpick: I'd probably name this using PEP 8 conventions. The reason for using the capital camel case is to match the names/parameters of the Windows API. In this case, the functionality is not part of the Windows API, but is a helper function, so probably should follow the naming of other helper functions.

When I was writing it, I was trying to imagine the function I wish was already in the API, and wrote the one I wish was there. But that's all inside my head. I agree that it should be named as a helper, not as a faux member of the real API.

jaraco · 2022-01-02T18:19:23Z

keyring/backends/Windows/api.py

+        Credential['CredentialBlob'] = accum.decode(encoding)
+        Credential['CredentialBlobSize'] = len(Credential['CredentialBlob'])


It seems a little odd to me to be utilizing the API structures to embed the altered values, values that wouldn't be allowed in the structure. I think I'd rather see a different approach where the encoding/decoding strategy sits as a layer between the Windows API and the keyring API, and less intertwined with the Windows API.

I can provide an example later.

Agreed (see my previous comment for how I got there.)
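One possible shape for such a layer, offered purely as an illustration and not as the example the reviewer had in mind. RawCredential and PasswordCodec are hypothetical names: the first mirrors what the API hands back (bytes only), the second owns every encoding decision, so the API structure is never overwritten with decoded values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawCredential:
    """What the Windows API layer returns: the blob stays as bytes."""
    target_name: str
    user_name: str
    blob: bytes


class PasswordCodec:
    """Hypothetical layer between the Windows API and the keyring API."""

    def __init__(self, encoding: str = "utf-8"):
        self.encoding = encoding

    def to_blob(self, password: str) -> bytes:
        return password.encode(self.encoding)

    def from_credential(self, cred: RawCredential) -> str:
        return cred.blob.decode(self.encoding)
```

Because RawCredential is frozen, any attempt to stuff a decoded string back into it fails loudly, which keeps the two layers from drifting together again.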

jaraco · 2022-01-02T18:24:24Z

tests/backends/test_Windows.py

-    def tearDown(self):
-        # clean up any credentials created
-        for cred in self.credentials_created:
-            try:
-                self.keyring.delete_password(*cred)
-            except Exception as e:
-                print(e, file=sys.stderr)


I don't understand what's going on here. This functionality seems to have been removed without justification. This code is there for a reason, preventing the pollution of a user's own keychain when running tests. Why was it removed?

When I ran my tests, I noticed that this function wasn't getting called. Furthermore, the cleanup function in the underlying BackendBasicTests was getting called (seemingly accomplishing the same goal, although without the try...except).

Looking at the pytest documentation, I see that there is an optional teardown (lowercase "d") method that a test class can provide. My guess is that this function was intended to be named teardown and was mistakenly named tearDown, which made it invisible to pytest.

Since the base class BackendBasicTests cleanup seemed to clean up all the unit test passwords in my run, I don't know that we need this one.

I should have documented why I yanked it -- sorry about that!

jaraco

Overall, this is looking really good. I quite like the direction it's gone and it's homing in on a sound solution.

I still worry there's just too much going on. This one change affects several different behaviors (introduces a new keyring, switches to a local ctypes API, changes existing password checks, introduces encoding, introduces tests for broken encoding, ...). Although I think the final implementation may look very much like what you have here, it's just too much changing for one commit.

I'd much rather see this change as a series of incremental changes, each addressing more selective concerns, either as separate, independent PRs or as separate commits if very closely related.

For example: PR1 could implement simply switching from pywin32-ctypes to a native ctypes implementation. That change could be reviewed, accepted, and released so that its effect could be proven and any unexpected regressions could be addressed before accepting more changes that depend on that change. Then PR2 could implement the sharding strategy (based on PR1). PR3 could fix the existing password checks and be independent of PR1 and PR2.

How would you feel about that approach?

jrobbins-LiveData · 2022-01-02T20:26:34Z

I like the idea of a phased set of pull requests. The only drawback is that I am inexperienced both with git and with tox etc on Linux. So I am getting bogged down in the mechanics.

I could really use some instructions as to how to install on Ubuntu so I can test.

Also, I don't know how to create new pull requests into my account without clobbering the existing cloned repo.

Other than these mechanical issues, I'd be happy to break it all apart into 3 PRs. Maybe you or someone else could undertake PR3 (and fix the typo with tearDown in the existing unit test for the Windows backend), since both of those issues are, as you say, unrelated to addressing the problem of accepting longer passwords on Windows?

[EDIT]
It appears that adding

import sys
if not sys.platform.startswith('win'):
    collect_ignore.append('keyring/backends/Windows/api.py')

to conftest.py would fix the tox/pytest crash on Ubuntu etc. Is that the correct fix?

jaraco · 2022-02-12T19:18:07Z

I could really use some instructions as to how to install on Ubuntu so I can test.

For me, the easiest way to test on Ubuntu is to download Docker. I have a "multipy-tox" image I use to test Python applications. From the keyring source folder:

docker run -it -v $(pwd):/src jaraco/multipy-tox
# cd src
# export TOX_WORK_DIR='/tmp/tox'
# tox

Also, I don't know how to create new pull requests into my account without clobbering the existing cloned repo.

Oh, that must be really frustrating. I'd recommend getting a nice repo GUI like VSCode or Sourcetree or PyCharm or Sublime Merge (maybe in that order) and using that to manage your repo. It will make it easier to discover operations (like creating a branch and pushing it to your GitHub repo) and also visualize your changes. I'd also recommend reading through some of the tutorial resources on using GitHub, which talk about creating branches in GitHub.

For your current code, I'd recommend creating a branch for it.

git branch windows-take-2-ctypes

Then check out your main branch and reset it to match upstream.

git checkout main
git fetch upstream
git reset --hard upstream/main

Note, you may or may not have a remote called "upstream". If not, you may need to create one first.

git remote add upstream https://github.com/jaraco/keyring

I hope that helps. I would recommend spending some time getting familiar with branching and submitting PRs from branches as that will make your life immensely easier when it comes to contributing these changes.

Thanks for your patience and perseverance on this effort!

jaraco · 2023-11-12T21:45:53Z

Closing as stale. Feel free to revive this effort at any time.

ctypes-based, separate class, shard via Attributes

1f8bfb3

jrobbins-LiveData mentioned this pull request Jan 2, 2022

Windows backend UTF-8 detection doesn't work #554

Open

jaraco reviewed Jan 2, 2022

jaraco requested changes Jan 2, 2022

jaraco closed this Nov 12, 2023

jaraco mentioned this pull request Mar 5, 2024

get_credential(service, None) does not find anything after deleting one of two usernames #664

Open
