Store token hashes instead of tokens #1509

tgeoghegan · 2023-06-16T19:00:52Z

@jbr's comment in divviup/divviup-api#195 made me think that we should change how Janus stores aggregator and collector authentication tokens so that we only store a cryptographic hash of the token instead of the actual token. This is a bit tricky to do, because currently we use the aggregator_auth_tokens table to store both tokens that our leader would present to a helper (in which case we need to store the actual token) as well as tokens that our helper would use to authenticate requests from another leader (in which case we only need the hash). So we would have to enrich the representation of tokens in Janus to distinguish between tokens we present and tokens we accept.

The text was updated successfully, but these errors were encountered:

tgeoghegan · 2023-06-23T17:28:36Z

@jbr points out that if Janus helpers no longer store the actual aggregator and collector auth tokens, then Janus/Divvi Up need a story for allowing subscribers to recover from losing a collector auth token or allowing a peer aggregator to recover from losing an aggregator auth token, without forcing the rotation of a task, which could be very expensive if it has been deployed to many clients.

One solution strategy would be for Janus' aggregator API to expose some endpoints for rotating the auth tokens associated with a task. Then, if a subscriber loses access to a token, they can revoke the old one and mint a new one.

inahga · 2023-09-20T16:51:05Z

Some questions on context, apologies if I missed it in slack or other issues.

What's the threat model and/or improvements that justify token hashing? Not saying that is is an invalid thing to do, it's moreso me wanting to avoid making assumptions--it would also help us evaluate how much disruption or UX compromise is acceptable.

If we're going that far, should we take on the notion of "collector credentials" we discussed previously (divviup/divviup-api#487), and make it the subscriber's responsibility to generate collector auth tokens and then provide divvivup-api the HPKE config + collector auth token hash?

As long as the UX is such that the subscriber does not have to juggle two tokens. To that end perhaps we should adopt divviup/divviup-api#455?

QUESTION: is PATCH better?

For deletion, I don't think so. What would the caller call PATCH with to communicate that the token should be revoked? They can't do revoked_at because it would would just be ignored (we might be tempted to allow an expireable token, but with task expiration I'm not sure this makes sense).

Should revocation be immediate, or delayed by some interval to allow for in-flight requests to resolve?

IMO revocation should be immediate because the worst case scenario that justifies token revocation is that the token has been leaked and is being exploited.

What requests would be in-flight? I can only think of collector polling. It would indeed cut them off--but an operator doing a controlled token rotation will always create a new token, apply it to their environment (presumably a collector restart would be graceful), then delete the old one so I don't think this is a real risk.

the request could also include a token ID, which would allow divviup-api to ensure token IDs are consistent across leader and helper. That could help with debugging.

Yes!

Note that token rotation is complicated by the fact that Janus caches task data indefinitely. #238. Perhaps highly complicated.

Database questions: Why not have the token_id be the primary key and omit id? (I think you would need an index on token_id anyway, but I'm not sure off-hand what EXPLAIN ANALYZE would say about looking up tokens)

Presumably the token hash will be stored in plaintext in the database? I don't think there's any value in encrypting a hash. For any values that are stored in plaintext, I think the ord parameter becomes unnecessary? IIRC we only use it for the AAD during encryption, although I defer to other's judgement on that.

However, those solutions admit representations of illegal states (e.g., if the role in the task is helper, then we'd expect only token_hash to be non-null, but the implementation has to check that).

IMO I think we should have token and token_hash--overloading columns can be confusing. I believe we can use a check constraint to deny an invalid state, or worst case can use a trigger. I'd need to read the manual and play around a bit to be sure though. It's debatable whether we want to encode that kind of business logic into the database though.

tgeoghegan · 2023-09-20T19:44:03Z

This whole proposal is obsolete because DAP forbids rotating collector HPKE configs associated with a task, and I don't wish to revisit that decision.

tgeoghegan · 2023-09-20T19:59:06Z

Answering questions that are still relevant given that we're not doing rotation:

What's the threat model and/or improvements that justify token hashing?

Good point and well worth articulating. We don't want an attacker with read access to the Janus or divviup-api databases to be able to exfiltrate a credential and subsequently impersonate either the DAP leader or collector. If we store a hash of the auth token, then a request recipient can still authenticate messages. Naturally the credential still gets exposed to the recipient because it'll appear in HTTP headers, but those only exist in memory and not in a persistent database.

As long as the UX is such that the subscriber does not have to juggle two tokens. To that end perhaps we should adopt divviup/divviup-api#455?

Yeah, I believe we've agreed in Slack to make headway toward the reusable collector credential.

the request could also include a token ID, which would allow divviup-api to ensure token IDs are consistent across leader and helper. That could help with debugging.

Yes!

👍🏻

Database questions: Why not have the token_id be the primary key and omit id? (I think you would need an index on token_id anyway, but I'm not sure off-hand what EXPLAIN ANALYZE would say about looking up tokens)

I don't have a good source handy just now for why this is a good practice but if nothing else we should do it this way for consistency with other tables. If we want to revisit whether to get rid of autoincrementing integer primary keys in Janus tables, we should do that separately from this discussion.

Presumably the token hash will be stored in plaintext in the database? I don't think there's any value in encrypting a hash. For any values that are stored in plaintext, I think the ord parameter becomes unnecessary? IIRC we only use it for the AAD during encryption, although I defer to other's judgement on that.

Yes, the token hashes don't need to be stored encrypted. I'm not sure whether the ord parameter can be omitted.

IMO I think we should have token and token_hash--overloading columns can be confusing. I believe we can use a check constraint to deny an invalid state, or worst case can use a trigger. I'd need to read the manual and play around a bit to be sure though. It's debatable whether we want to encode that kind of business logic into the database though.

Good point, we can likely write a constraint that enforces this safely.

tgeoghegan · 2023-09-20T20:16:19Z

I think one important thing that's emerged from today's discussions in Slack is that we will not be supporting rotation of task-associated secrets like HPKE configs and auth tokens. That means that as argued in #1521, we should collapse the auth token tables into tasks.

When acting as an HTTP server, Janus only needs a hash of the aggregator authentication token to validate incoming requests, and can avoid storing the actual token. This commit adds `janus_core::auth_tokens::AuthenticationTokenHash` which will be used to implement that. Part of #1509

When acting as the helper, Janus only needs the aggregator auth token hash to validate requests. It never makes aggregation sub-protocol requests and thus doesn't need the actual token. In this commit we change the schema of the `tasks` table so that helpers may store an auth token hash instead of an auth token. These changes are reflected in `janus_aggregator_core::task::AggregatorTask`. Finally we make changes to the aggregator API to accommodate this: the helper reports the aggregator auth token it generates during task provisioning, but doesn't store that token and cannot subsequently provide it when responding to `GET /tasks/:task_id` requests. Fortunately that field in `TaskResp` was already optional, and divviup-api already ignored it except in the creation case. Part of #1509, 1524

The leader only needs the hash of the collector auth token in order to authenticate collection sub-protocol requests and thus we can avoid the risk of storing the unhashed token. This commit changes the `tasks` table to store a hash instead of the token, and makes the corresponding change to `janus_aggregator_core::task::AggregatorTask`. There are corresponding changes to the aggregator API: the leader must now be provided the collector auth token hash during task creation, and `TaskResp` now only contains a collector auth token hash. Part of #1509, #1524

The leader only needs the hash of the collector auth token in order to authenticate collection sub-protocol requests and thus we can avoid the risk of storing the unhashed token. This commit changes the `tasks` table to store a hash instead of the token, and makes the corresponding change to `janus_aggregator_core::task::AggregatorTask`. There are corresponding changes to the aggregator API: the leader must now be provided the collector auth token hash during task creation, and `TaskResp` now only contains a collector auth token hash. Additionally, we extend the aggregator API's config endpoint so that it now can report a list of features that will be used by divvi-up API to decide how to interact with it. The only such feature, which Janus unconditionally advertises, is `token-hash`, indicating that Janus stores token hashes instead of unhashed tokens where possible. Part of #1509, #1524

When acting as the helper, Janus only needs the aggregator auth token hash to validate requests. It never makes aggregation sub-protocol requests and thus doesn't need the actual token. In this commit we change the schema of the `tasks` table so that helpers may store an auth token hash instead of an auth token. These changes are reflected in `janus_aggregator_core::task::AggregatorTask`. Finally we make changes to the aggregator API to accommodate this: the helper reports the aggregator auth token it generates during task provisioning, but doesn't store that token and cannot subsequently provide it when responding to `GET /tasks/:task_id` requests. Fortunately that field in `TaskResp` was already optional, and divviup-api already ignored it except in the creation case. Part of #1509, 1524

The leader only needs the hash of the collector auth token in order to authenticate collection sub-protocol requests and thus we can avoid the risk of storing the unhashed token. This commit changes the `tasks` table to store a hash instead of the token, and makes the corresponding change to `janus_aggregator_core::task::AggregatorTask`. There are corresponding changes to the aggregator API: the leader must now be provided the collector auth token hash during task creation, and `TaskResp` now only contains a collector auth token hash. Additionally, we extend the aggregator API's config endpoint so that it now can report a list of features that will be used by divvi-up API to decide how to interact with it. The only such feature, which Janus unconditionally advertises, is `token-hash`, indicating that Janus stores token hashes instead of unhashed tokens where possible. Part of #1509, #1524

tgeoghegan · 2023-10-04T01:11:38Z

I believe we have all the Janus changes we want in place, but I'm going to keep this open until we prove that this works e2e with divviup-api, as the changes on that side have not landed yet.

tgeoghegan · 2023-10-05T23:06:09Z

We have divviup/divviup-api#542 to track divviup-api changes, and bugfixes to resolve impedance mismatches can follow in later issues/PRs. Closing this.

tgeoghegan mentioned this issue Jun 23, 2023

Update aggregator API for BYOHelper #1525

Merged

tgeoghegan mentioned this issue Jun 23, 2023

Collapse task_vdaf_verify_keys, task_collector_auth_tokens, task_aggregator_auth_tokens tables into tasks #1521

Closed

tgeoghegan self-assigned this Jun 30, 2023

tgeoghegan added this to the Change representation of tasks in datastore milestone Aug 2, 2023

tgeoghegan mentioned this issue Sep 13, 2023

Question: key-material from DB or environment variables? #1921

Closed

This comment was marked as outdated.

Sign in to view

tgeoghegan mentioned this issue Sep 25, 2023

Add AuthenticationTokenHash #1985

Merged

tgeoghegan mentioned this issue Oct 2, 2023

Only store aggregator auth token hash in helper #2062

Merged

tgeoghegan mentioned this issue Oct 2, 2023

only store collector auth token hash in leader #2064

Merged

tgeoghegan mentioned this issue Oct 3, 2023

Janus 0.6.0 blockers #2071

Closed

9 tasks

tgeoghegan closed this as completed Oct 4, 2023

tgeoghegan reopened this Oct 4, 2023

tgeoghegan closed this as completed Oct 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Store token hashes instead of tokens #1509

Store token hashes instead of tokens #1509

tgeoghegan commented Jun 16, 2023

tgeoghegan commented Jun 23, 2023

This comment was marked as outdated.

inahga commented Sep 20, 2023 •

edited

Loading

tgeoghegan commented Sep 20, 2023

tgeoghegan commented Sep 20, 2023

tgeoghegan commented Sep 20, 2023

tgeoghegan commented Oct 4, 2023

tgeoghegan commented Oct 5, 2023

Store token hashes instead of tokens #1509

Store token hashes instead of tokens #1509

Comments

tgeoghegan commented Jun 16, 2023

tgeoghegan commented Jun 23, 2023

This comment was marked as outdated.

inahga commented Sep 20, 2023 • edited Loading

tgeoghegan commented Sep 20, 2023

tgeoghegan commented Sep 20, 2023

tgeoghegan commented Sep 20, 2023

tgeoghegan commented Oct 4, 2023

tgeoghegan commented Oct 5, 2023

inahga commented Sep 20, 2023 •

edited

Loading