-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
💩 🔧 Enable suspending data collection for certain subgroups #1018
Conversation
09291fe
to
9b5131a
Compare
@JGreenlee @TeachMeTW once this change is on production, we can discuss if we want to continue with this hack on the server, or just go to the principled fix, in which case, most of these 💩 changes can be reverted. |
This revises the initial implementation in e-mission#1016 of a hack for e-mission/e-mission-docs#1104 The request in the issue was to implement the ability to suspend and resume data collection. The previous implementation checked for suspended subgroups in the auth method, and used the `request.path` to determine whether it was a `usercache/put` call or not This was needed because, by design, we only see the opcode in the auth module; the rest of the code works only with UUIDs The problem with that approach was that if the subgroup was suspended, we returned a UUID of None, which triggered a 403 error, which, in turn, caused the phone app to detect a failed call, so it did not delete the entries locally. e-mission#1016 (comment) e-mission#1016 (comment) We fix this by returning the non-random parts of the opcode (at least the subgroup) in a context, and perform the validation downstream (in the `putIntoCache` method). So if the subgroup is suspended, `putIntoCache` just returns early instead of storing the values. Note that this is also a bit of a hack, given that the way we handle the dynamic auth right now is to reset the global `auth_method`!! So we can't use a check for `auth_method == 'dynamic'` anywhere outside `resolve_auth` and have to use the existence of the dynamic config as the check for whether the auth method was dynamic or not Testing done: - TestWebserver.py passes - TestAuthSelection.py does not need any changes since we made sure to be backwards compatible When trying to push to a suspended subgroup, we get ``` 2025-02-07 11:10:26,127:DEBUG:6362165248:START POST /usercache/put 2025-02-07 11:10:26,127:DEBUG:6362165248:Called userCache.put 2025-02-07 11:10:26,189:DEBUG:6362165248:methodName = skip, returning <class 'emission.net.au th.skip.SkipMethod'> 2025-02-07 11:10:26,189:DEBUG:6362165248:Using the skip method to verify id token nrelop_dev- emulator-study_default_123 of length 37 2025-02-07 11:10:26,190:DEBUG:6362165248:subgroup default found in list ['test', 'default'] 2025-02-07 11:10:26,190:DEBUG:6362165248:retContext = {'subgroup': 'default', 'user_id': UUID ('eb4a7aae-f2d4-4480-ba85-c568a45591b5')} 2025-02-07 11:10:26,191:DEBUG:6362165248:user_uuid {'subgroup': 'default', 'user_id': UUID('e b4a7aae-f2d4-4480-ba85-c568a45591b5')} 2025-02-07 11:10:26,191:DEBUG:6362165248:suspended_subgroups=['default'] 2025-02-07 11:10:26,191:INFO:6362165248:Received put message for subgroup default in suspende d_subgroups=['default'], ignoring 2025-02-07 11:10:26,191:DEBUG:6362165248:END POST /usercache/put eb4a7aae-f2d4-4480-ba85-c568 ``` When trying to push to a non-suspended subgroup, we get ``` 2025-02-07 11:26:13,627:DEBUG:12901707776:START POST /usercache/put 2025-02-07 11:26:13,628:DEBUG:12901707776:Called userCache.put 2025-02-07 11:26:13,631:DEBUG:12901707776:methodName = skip, returning <class 'emission.net.auth.skip.SkipMethod'> 2025-02-07 11:26:13,631:DEBUG:12901707776:Using the skip method to verify id token nrelop_dev-emulator-study_test_123 of length 34 2025-02-07 11:26:13,634:DEBUG:12901707776:subgroup test found in list ['test', 'default'] 2025-02-07 11:26:13,634:DEBUG:12901707776:retContext = {'subgroup': 'test', 'user_id': UUID('feb70456-abf4-444b-8848-1515fc3470cf')} 2025-02-07 11:26:13,634:DEBUG:12901707776:user_uuid {'subgroup': 'test', 'user_id': UUID('feb70456-abf4-444b-8848-1515fc3470cf')} 2025-02-07 11:26:13,634:DEBUG:12901707776:suspended_subgroups=['default'] 2025-02-07 11:26:13,644:DEBUG:12901707776:Updated result for user = feb70456-abf4-444b-8848-1515fc3470cf, key = stats/client_time, write_ts = 1738956332.243175 = {'n': 1, 'nModified': 0, 'upserted': ObjectId('67a65e5546756f8244f024f6'), 'ok': 1.0, 'updatedExisting': False} 2025-02-07 11:26:13,646:DEBUG:12901707776:Updated result for user = feb70456-abf4-444b-8848-1515fc3470cf, key = stats/client_time, write_ts = 1738956332.246808 = {'n': 1, 'nModified': 0, 'upserted': ObjectId('67a65e5546756f8244f024f8'), 'ok': 1.0, 'updatedExisting': False} 2025-02-07 11:26:13,857:DEBUG:12901707776:Updated result for user = feb70456-abf4-444b-8848-1515fc3470cf, key = statemachine/transition, write_ts = 1738956373.598038 = {'n': 1, 'nModified': 0, 'upserted': ObjectId('67a65e5546756f8244f025cd'), 'ok': 1.0, 'updatedExisting': False} 2025-02-07 11:26:13,858:DEBUG:12901707776:Updated result for user = feb70456-abf4-444b-8848-1515fc3470cf, key = background/battery, write_ts = 1738956373.5991821 = {'n': 1, 'nModified': 0, 'upserted': ObjectId('67a65e5546756f8244f025cf'), 'ok': 1.0, 'updatedExisting': False} 2025-02-07 11:26:13,859:DEBUG:12901707776:Updated result for user = feb70456-abf4-444b-8848-1515fc3470cf, key = stats/client_nav_event, write_ts = 1738956373.607026 = {'n': 1, 'nModified': 0, 'upserted': ObjectId('67a65e5546756f8244f025d1'), 'ok': 1.0, 'updatedExisting': False} 2025-02-07 11:26:13,860:DEBUG:12901707776:END POST /usercache/put feb70456-abf4-444b-8848-1515fc3470cf 0.2325270175933838 ```
9b5131a
to
788ffaf
Compare
On the server, I now see
And on the phone, I see a "successful" push
And no cached entries when I next check.
|
I'm going to unsuspend |
This reverts commit 5f54292. So that we can re-verify the common case before deploying to production e-mission/e-mission-server#1018 (comment)
After unsuspending
But it succeeded after I returned home
Then today morning, we went up from 2 trips to 5 trips
On the server,
Moving this to production on nrel commute and open access for now, will move to production on the other deployments tonight. |
verified that this is working properly with my personal phone
I can't verify anything further downstream because the pipeline is stuck but this is working properly |
This revises the initial implementation in
#1016
of a hack for e-mission/e-mission-docs#1104
The request in the issue was to implement the ability to suspend and resume data collection.
The previous implementation checked for suspended subgroups in the auth method, and used the
request.path
to determine whether it was ausercache/put
call or not. This was needed because, by design, we only see the opcode in the auth module; the rest of the code works only with UUIDsThe problem with that approach was that if the subgroup was suspended, we returned a UUID of None, which triggered a 403 error, which, in turn, caused the phone app to detect a failed call, so it did not delete the entries locally.
#1016 (comment)
#1016 (comment)
We fix this by returning the non-random parts of the opcode (at least the subgroup) in a context, and perform the validation downstream (in the
putIntoCache
method). So if the subgroup is suspended,putIntoCache
just returns early instead of storing the values.Note that this is also a bit of a hack, given that the way we handle the dynamic auth right now is to reset the global
auth_method
!! So we can't use a check forauth_method == 'dynamic'
anywhere outsideresolve_auth
and haveto use the existence of the dynamic config as the check for whether the auth method was dynamic or not
Testing done:
When trying to push to a suspended subgroup, we get
When trying to push to a non-suspended subgroup, we get