Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a "bad record handler" for SnssFile #27

Open
youk opened this issue Dec 9, 2024 · 8 comments
Open

Implement a "bad record handler" for SnssFile #27

youk opened this issue Dec 9, 2024 · 8 comments
Assignees
Labels
enhancement New feature or request

Comments

@youk
Copy link

youk commented Dec 9, 2024

Is reading of session files supposed to work at all? ccl_chromium_snss2.py just fails:

ValueError: 131 is not a valid SessionRestoreIdType

Microsoft Edge 131.0.2903.86

@AlexC-CCL
Copy link
Contributor

Hi there, I haven't tested with the most recent builds of Chrome but it was working with the data I had available when it then the code was published, yes.

The error message is correct, 131 is not a valid SessionRestoreIdType (see Chromium source-code here: https://source.chromium.org/chromium/chromium/src/+/main:components/sessions/core/session_service_commands.cc;l=29), so it might be that something has changed in the format since this was built.

If you are able to share the data I could take a look at it, otherwise I'll try it with version 131 data, but unless it's a fairly trivial change, I won't be able to dig in much before the end of the year due to other comitments, sorry.

@AlexC-CCL
Copy link
Contributor

It doesn's appear that anything has changed in the core serialization, as far as Chromium is concerned: https://source.chromium.org/chromium/chromium/src/+/main:components/sessions/core/serialized_navigation_entry.cc;bpv=1

The format is based around a pickle, which is prefixed with a length, so it seems less likely that the record buffer is being read incorrectly, so my initial guess is that this might be either: an Edge specific customization; or a corrupt file.

@AlexC-CCL
Copy link
Contributor

AlexC-CCL commented Dec 9, 2024

Update: I can confirm that I'm getting a similar error with Edge but not Chrome, so it's looking like a Edge customization to the format.

@AlexC-CCL
Copy link
Contributor

AlexC-CCL commented Dec 9, 2024

I have confirmed that Edge is using some custom command IDs. I found that 132 is also being used in my data. Fixed in a8a1924 - let me know if this resolves your issue please.

@youk
Copy link
Author

youk commented Dec 9, 2024

This helped, thanks!

However, with one session file I am getting ValueError: 44 is not a valid SessionRestoreIdType now. That's a large file with a lot of entries. I could try to pin down the problematic entry, but maybe this can be fixed in the same way?

Would it be possible to change the processing so that it doesn't fail when unknown SessionRestoreIdType is found? Just print the message to stderr and continue.

@AlexC-CCL
Copy link
Contributor

I'll look at implementing a "bad record handler" like I have in a few other places so that the calling code gets the full context of the reason and location of the bad data. I like an exception by default, because people don't generally report issues / I don't notice changes to the underlying format if it doesn't crash.

In the meantime you could add the following to the SnssFile class to work around it:

    def iter_session_commands2(self):
        self.reset()
        while True:
            try:
                command = self._get_next_session_command()
            except ValueError as ex:
                print(ex, file=sys.stderr)
                continue
            yield command

(And then change any calls to the original version of the function to this version, or just overwrite the original function I suppose).

@youk
Copy link
Author

youk commented Dec 9, 2024

Thank you. I agree that default behavior should be throwing an exception.

@AlexC-CCL AlexC-CCL changed the title SnssFileType.Session isn't functional Implement a "bad record handler" for SnssFile Dec 9, 2024
@AlexC-CCL
Copy link
Contributor

I've renamed the issue to represent the current plan of action to resolve this more long-term. I'll leave the issue open on that basis.

@AlexC-CCL AlexC-CCL added the enhancement New feature or request label Dec 9, 2024
@AlexC-CCL AlexC-CCL self-assigned this Dec 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants