-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Classifier: Remove the faulty session id code #6499
Conversation
@@ -1,19 +1,12 @@ | |||
import hash from 'hash.js' | |||
|
|||
const storage = window.sessionStorage || window.localStorage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why the option for localStorage is here. This feature is supposed to be a session id, which refreshes just like a browser session. If researchers are using session ids to analyze anonymous users, then don't use localStorage as a backup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- ✔️ Agreed, in this scenario a fallback is actually a bad option that leads to inconsistent/unpredictable behaviour. Removing localStorage is a good idea.
- ❓ I'm wondering if the rest of the session.js needs to be made more robust. I just realised a lot of the code in this file does things like
window.sessionStorage
andstorage.getItem()
, instead ofwindow?.sessionStorage
andstorage?.getItem()
, which may inexplicably crash the process if window or window.sessionStorage is ever unavailable. (Question for Shaun: when would such a scenario happen? Browser security settings? Incognito mode? "Do not track" enabled?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the classifier runs only in the browser, the chance of window
being undefined is lower than components of app-project, for example, and the entire classifier is also lazy loaded. I'm not sure if user browser settings could impact window
🤔 In a test env we've mocked window
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sessionStorage
will throw a SecurityException
if it violates browser privacy settings, so you should catch that.
https://developer.mozilla.org/en-US/docs/Web/API/Window/sessionStorage#exceptions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might be able to throw that error by setting the browser to block www.zooniverse.org
in an ad blocker or in cookie preferences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
@@ -24,18 +17,7 @@ const sessionUtils = { | |||
|
|||
getSessionID () { | |||
const stored = (storage.getItem('session_id')) ? JSON.parse(storage.getItem('session_id')) : this.generateSessionID() | |||
let { id, ttl } = stored | |||
|
|||
if (ttl < Date.now()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition was comparing a Date() string to a Date.now() number and therefore never met. It was copied directly from PFE's codebase, so I'll be doing the same cleanup over there.
Here's a little pen that, I think, shows that the original comparison is always false. |
generateSessionID () { | ||
const sha2 = hash.sha256() | ||
const id = sha2.update(`${Math.random() * 10000}${Date.now()}${Math.random() * 1000}`).digest('hex') | ||
const ttl = this.fiveMinutesFromNow() | ||
const stored = { id, ttl } | ||
const stored = { id } | ||
try { | ||
storage.setItem('session_id', JSON.stringify(stored)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[rambling][minor] Curiously, this means that we're storing session ID in an unnecessary "object layer", i.e. session_id = "{ id: \"1234\" }"
instead of session_id = "1234"
. Keeping it this way is fine by me, though. (And even if we did change it to the latter format, I wonder what'd happen if a user with an existing session ID refreshes their page after we push the changes. 🤔 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember that this stored object is also going to be read by PFE, so anything written to storage here must be readable by PFE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PFE might also expect the stored object to have a ttl
property, come to think of it. I assume PFE reads the session when you go to Talk, or your collections etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If PFE does generate a new session when you go to Talk, that would be a breaking change, so it's worth testing that case.
PFE code and FEM code running in the same tab, on the same origin, uses the same session ID in production.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Review (Ongoing)
Package: lib-classifier
This PR removes some code in our session ID-generating function. The removed portion was never working, so no actual changes in website behaviour are expected.
- Previously:
- In theory, a session ID (for a browser tab/window) should be generated when the
page loadsfirst classification is submitted (see ❗ note below on FEM vs PFE), and should be re-generated after 5 minutes of inactivity. - In practice, a session ID (f.a.b.t.w.) is generated when the
page loadsfirst classification is submitted and remains throughout the lifetime of the browser window/tab.- (...unless localStorage somehow kicks in, then it's anybody's game. This is an extremely unlikely scenario, though.)
- In theory, a session ID (for a browser tab/window) should be generated when the
- With this PR:
- A session ID is generated when the
page loadsfirst classification is submitted and remains throughout the lifetime of the browser window/tab. - The README has been updated accordingly. (Note: there is a separate effort to inform project owners and researchers of the actual vs expected behaviour.)
- A session ID is generated when the
Code changes look good, I just need to run some tests to confirm.
Dev Notes
❗ Important note:
- On FEM, the session ID (f.a.b.t.w.) is ONLY generated on the Classify page, when a classification is submitted for the first time.
- On PFE, the session ID (f.a.b.t.w.) is generated as soon as you open any PFE page.
- For example, go to the Projects listing page - https://www.zooniverse.org/projects - and you should see a session ID generated.
- This behaviour should be fine if Session ID only matters to classifications, which I think they are.
Additional notes, to illustrate the issue that Delilah uncovered, with some examples:
- fiveMinutesFromNow() generates a Date.
- That date is stored as a String that looks like
"2024-12-03T15:59:48.883Z"
- Date.now() generates a number like
1733241013923
- So
"2024-12-03T15:59:48.883Z" < 1733241013923
=>false
. - In fact,
"(any string that's not a simple number that JS can auto-convert)" < ANY_NUMBER
=>false
, always. (Same is true for>
,<=
,>=
,==
btw)
Testing
Testing ongoing with macOS + Chrome, on localhost, on lib-classifier
- Run lib-classifier via
yarn dev
- Open Classifier in a new tab (Tab 1). Submit a Classification. Expect to see a new session ID in Chrome's session storage for this tab.
- Close Tab 1.
- Open Classifier in a new tab (Tab 2). Submit a yadda yadda, expect to see a new session ID, different from Tab 1.
- Without closing Tab 2, open Tab 3. Submit blah blah, expect to see a new session ID in Tab 3, while Tab 2's session ID should remain the same.
Status
Testing ongoing, should be done soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Review (Complete)
All tests pass - functionality checks out, and behaviour is working as expected. LGTM! 👍
Non-numeric strings are converted to |
Package
lib-classifier
Linked Issue and/or Talk Post
In response to discussion in: #6493 (comment)
Will link corresponding PFE PR here
Describe your changes
How to Review
{ id }
is added to your browser's session storage.id
is attached to the classification's metadata (this is all existing behavior, I'm simply removing the faulty code and unit tests).Checklist
PR Creator - Please cater the checklist to fit the review needed for your code changes.
PR Reviewer - Use the checklist during your review. Each point should be checkmarked or discussed before PR approval.
General
yarn panic && yarn bootstrap
ordocker-compose up --build
and FEM works as expectedGeneral UX
Example Staging Project: i-fancy-cats
Bug Fix