Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store token hashes instead of tokens #1509

Closed
Tracked by #2071
tgeoghegan opened this issue Jun 16, 2023 · 8 comments
Closed
Tracked by #2071

Store token hashes instead of tokens #1509

tgeoghegan opened this issue Jun 16, 2023 · 8 comments
Assignees

Comments

@tgeoghegan
Copy link
Contributor

@jbr's comment in divviup/divviup-api#195 made me think that we should change how Janus stores aggregator and collector authentication tokens so that we only store a cryptographic hash of the token instead of the actual token. This is a bit tricky to do, because currently we use the aggregator_auth_tokens table to store both tokens that our leader would present to a helper (in which case we need to store the actual token) as well as tokens that our helper would use to authenticate requests from another leader (in which case we only need the hash). So we would have to enrich the representation of tokens in Janus to distinguish between tokens we present and tokens we accept.

@tgeoghegan
Copy link
Contributor Author

@jbr points out that if Janus helpers no longer store the actual aggregator and collector auth tokens, then Janus/Divvi Up need a story for allowing subscribers to recover from losing a collector auth token or allowing a peer aggregator to recover from losing an aggregator auth token, without forcing the rotation of a task, which could be very expensive if it has been deployed to many clients.

One solution strategy would be for Janus' aggregator API to expose some endpoints for rotating the auth tokens associated with a task. Then, if a subscriber loses access to a token, they can revoke the old one and mint a new one.

@tgeoghegan

This comment was marked as outdated.

@inahga
Copy link
Contributor

inahga commented Sep 20, 2023

Some questions on context, apologies if I missed it in slack or other issues.

What's the threat model and/or improvements that justify token hashing? Not saying that is is an invalid thing to do, it's moreso me wanting to avoid making assumptions--it would also help us evaluate how much disruption or UX compromise is acceptable.

If we're going that far, should we take on the notion of "collector credentials" we discussed previously (divviup/divviup-api#487), and make it the subscriber's responsibility to generate collector auth tokens and then provide divvivup-api the HPKE config + collector auth token hash?

As long as the UX is such that the subscriber does not have to juggle two tokens. To that end perhaps we should adopt divviup/divviup-api#455?

QUESTION: is PATCH better?

For deletion, I don't think so. What would the caller call PATCH with to communicate that the token should be revoked? They can't do revoked_at because it would would just be ignored (we might be tempted to allow an expireable token, but with task expiration I'm not sure this makes sense).

Should revocation be immediate, or delayed by some interval to allow for in-flight requests to resolve?

IMO revocation should be immediate because the worst case scenario that justifies token revocation is that the token has been leaked and is being exploited.

What requests would be in-flight? I can only think of collector polling. It would indeed cut them off--but an operator doing a controlled token rotation will always create a new token, apply it to their environment (presumably a collector restart would be graceful), then delete the old one so I don't think this is a real risk.

the request could also include a token ID, which would allow divviup-api to ensure token IDs are consistent across leader and helper. That could help with debugging.

Yes!


Note that token rotation is complicated by the fact that Janus caches task data indefinitely. #238. Perhaps highly complicated.


Database questions: Why not have the token_id be the primary key and omit id? (I think you would need an index on token_id anyway, but I'm not sure off-hand what EXPLAIN ANALYZE would say about looking up tokens)

Presumably the token hash will be stored in plaintext in the database? I don't think there's any value in encrypting a hash. For any values that are stored in plaintext, I think the ord parameter becomes unnecessary? IIRC we only use it for the AAD during encryption, although I defer to other's judgement on that.

However, those solutions admit representations of illegal states (e.g., if the role in the task is helper, then we'd expect only token_hash to be non-null, but the implementation has to check that).

IMO I think we should have token and token_hash--overloading columns can be confusing. I believe we can use a check constraint to deny an invalid state, or worst case can use a trigger. I'd need to read the manual and play around a bit to be sure though. It's debatable whether we want to encode that kind of business logic into the database though.

@tgeoghegan
Copy link
Contributor Author

This whole proposal is obsolete because DAP forbids rotating collector HPKE configs associated with a task, and I don't wish to revisit that decision.

@tgeoghegan
Copy link
Contributor Author

Answering questions that are still relevant given that we're not doing rotation:

What's the threat model and/or improvements that justify token hashing?

Good point and well worth articulating. We don't want an attacker with read access to the Janus or divviup-api databases to be able to exfiltrate a credential and subsequently impersonate either the DAP leader or collector. If we store a hash of the auth token, then a request recipient can still authenticate messages. Naturally the credential still gets exposed to the recipient because it'll appear in HTTP headers, but those only exist in memory and not in a persistent database.

As long as the UX is such that the subscriber does not have to juggle two tokens. To that end perhaps we should adopt divviup/divviup-api#455?

Yeah, I believe we've agreed in Slack to make headway toward the reusable collector credential.

the request could also include a token ID, which would allow divviup-api to ensure token IDs are consistent across leader and helper. That could help with debugging.

Yes!

👍🏻

Database questions: Why not have the token_id be the primary key and omit id? (I think you would need an index on token_id anyway, but I'm not sure off-hand what EXPLAIN ANALYZE would say about looking up tokens)

I don't have a good source handy just now for why this is a good practice but if nothing else we should do it this way for consistency with other tables. If we want to revisit whether to get rid of autoincrementing integer primary keys in Janus tables, we should do that separately from this discussion.

Presumably the token hash will be stored in plaintext in the database? I don't think there's any value in encrypting a hash. For any values that are stored in plaintext, I think the ord parameter becomes unnecessary? IIRC we only use it for the AAD during encryption, although I defer to other's judgement on that.

Yes, the token hashes don't need to be stored encrypted. I'm not sure whether the ord parameter can be omitted.

IMO I think we should have token and token_hash--overloading columns can be confusing. I believe we can use a check constraint to deny an invalid state, or worst case can use a trigger. I'd need to read the manual and play around a bit to be sure though. It's debatable whether we want to encode that kind of business logic into the database though.

Good point, we can likely write a constraint that enforces this safely.

@tgeoghegan
Copy link
Contributor Author

I think one important thing that's emerged from today's discussions in Slack is that we will not be supporting rotation of task-associated secrets like HPKE configs and auth tokens. That means that as argued in #1521, we should collapse the auth token tables into tasks.

tgeoghegan added a commit that referenced this issue Sep 25, 2023
When acting as an HTTP server, Janus only needs a hash of the aggregator
authentication token to validate incoming requests, and can avoid
storing the actual token. This commit adds
`janus_core::auth_tokens::AuthenticationTokenHash` which will be used to
implement that.

Part of #1509
tgeoghegan added a commit that referenced this issue Sep 25, 2023
When acting as an HTTP server, Janus only needs a hash of the aggregator
authentication token to validate incoming requests, and can avoid
storing the actual token. This commit adds
`janus_core::auth_tokens::AuthenticationTokenHash` which will be used to
implement that.

Part of #1509
tgeoghegan added a commit that referenced this issue Sep 25, 2023
When acting as an HTTP server, Janus only needs a hash of the aggregator
authentication token to validate incoming requests, and can avoid
storing the actual token. This commit adds
`janus_core::auth_tokens::AuthenticationTokenHash` which will be used to
implement that.

Part of #1509
tgeoghegan added a commit that referenced this issue Oct 2, 2023
When acting as the helper, Janus only needs the aggregator auth token
hash to validate requests. It never makes aggregation sub-protocol
requests and thus doesn't need the actual token.

In this commit we change the schema of the `tasks` table so that helpers
may store an auth token hash instead of an auth token. These changes are
reflected in `janus_aggregator_core::task::AggregatorTask`. Finally we
make changes to the aggregator API to accommodate this: the helper
reports the aggregator auth token it generates during task provisioning,
but doesn't store that token and cannot subsequently provide it when
responding to `GET /tasks/:task_id` requests. Fortunately that field in
`TaskResp` was already optional, and divviup-api already ignored it
except in the creation case.

Part of #1509, 1524
tgeoghegan added a commit that referenced this issue Oct 2, 2023
When acting as the helper, Janus only needs the aggregator auth token
hash to validate requests. It never makes aggregation sub-protocol
requests and thus doesn't need the actual token.

In this commit we change the schema of the `tasks` table so that helpers
may store an auth token hash instead of an auth token. These changes are
reflected in `janus_aggregator_core::task::AggregatorTask`. Finally we
make changes to the aggregator API to accommodate this: the helper
reports the aggregator auth token it generates during task provisioning,
but doesn't store that token and cannot subsequently provide it when
responding to `GET /tasks/:task_id` requests. Fortunately that field in
`TaskResp` was already optional, and divviup-api already ignored it
except in the creation case.

Part of #1509, 1524
tgeoghegan added a commit that referenced this issue Oct 2, 2023
When acting as the helper, Janus only needs the aggregator auth token
hash to validate requests. It never makes aggregation sub-protocol
requests and thus doesn't need the actual token.

In this commit we change the schema of the `tasks` table so that helpers
may store an auth token hash instead of an auth token. These changes are
reflected in `janus_aggregator_core::task::AggregatorTask`. Finally we
make changes to the aggregator API to accommodate this: the helper
reports the aggregator auth token it generates during task provisioning,
but doesn't store that token and cannot subsequently provide it when
responding to `GET /tasks/:task_id` requests. Fortunately that field in
`TaskResp` was already optional, and divviup-api already ignored it
except in the creation case.

Part of #1509, 1524
tgeoghegan added a commit that referenced this issue Oct 2, 2023
When acting as the helper, Janus only needs the aggregator auth token
hash to validate requests. It never makes aggregation sub-protocol
requests and thus doesn't need the actual token.

In this commit we change the schema of the `tasks` table so that helpers
may store an auth token hash instead of an auth token. These changes are
reflected in `janus_aggregator_core::task::AggregatorTask`. Finally we
make changes to the aggregator API to accommodate this: the helper
reports the aggregator auth token it generates during task provisioning,
but doesn't store that token and cannot subsequently provide it when
responding to `GET /tasks/:task_id` requests. Fortunately that field in
`TaskResp` was already optional, and divviup-api already ignored it
except in the creation case.

Part of #1509, 1524
tgeoghegan added a commit that referenced this issue Oct 2, 2023
The leader only needs the hash of the collector auth token in order to
authenticate collection sub-protocol requests and thus we can avoid the
risk of storing the unhashed token.

This commit changes the `tasks` table to store a hash instead of the
token, and makes the corresponding change to
`janus_aggregator_core::task::AggregatorTask`. There are corresponding
changes to the aggregator API: the leader must now be provided the
collector auth token hash during task creation, and `TaskResp` now only
contains a collector auth token hash.

Part of #1509, #1524
tgeoghegan added a commit that referenced this issue Oct 2, 2023
The leader only needs the hash of the collector auth token in order to
authenticate collection sub-protocol requests and thus we can avoid the
risk of storing the unhashed token.

This commit changes the `tasks` table to store a hash instead of the
token, and makes the corresponding change to
`janus_aggregator_core::task::AggregatorTask`. There are corresponding
changes to the aggregator API: the leader must now be provided the
collector auth token hash during task creation, and `TaskResp` now only
contains a collector auth token hash.

Additionally, we extend the aggregator API's config endpoint so that it
now can report a list of features that will be used by divvi-up API to
decide how to interact with it. The only such feature, which Janus
unconditionally advertises, is `token-hash`, indicating that Janus
stores token hashes instead of unhashed tokens where possible.

Part of #1509, #1524
tgeoghegan added a commit that referenced this issue Oct 3, 2023
When acting as the helper, Janus only needs the aggregator auth token
hash to validate requests. It never makes aggregation sub-protocol
requests and thus doesn't need the actual token.

In this commit we change the schema of the `tasks` table so that helpers
may store an auth token hash instead of an auth token. These changes are
reflected in `janus_aggregator_core::task::AggregatorTask`. Finally we
make changes to the aggregator API to accommodate this: the helper
reports the aggregator auth token it generates during task provisioning,
but doesn't store that token and cannot subsequently provide it when
responding to `GET /tasks/:task_id` requests. Fortunately that field in
`TaskResp` was already optional, and divviup-api already ignored it
except in the creation case.

Part of #1509, 1524
tgeoghegan added a commit that referenced this issue Oct 3, 2023
When acting as the helper, Janus only needs the aggregator auth token
hash to validate requests. It never makes aggregation sub-protocol
requests and thus doesn't need the actual token.

In this commit we change the schema of the `tasks` table so that helpers
may store an auth token hash instead of an auth token. These changes are
reflected in `janus_aggregator_core::task::AggregatorTask`. Finally we
make changes to the aggregator API to accommodate this: the helper
reports the aggregator auth token it generates during task provisioning,
but doesn't store that token and cannot subsequently provide it when
responding to `GET /tasks/:task_id` requests. Fortunately that field in
`TaskResp` was already optional, and divviup-api already ignored it
except in the creation case.

Part of #1509, 1524
tgeoghegan added a commit that referenced this issue Oct 3, 2023
The leader only needs the hash of the collector auth token in order to
authenticate collection sub-protocol requests and thus we can avoid the
risk of storing the unhashed token.

This commit changes the `tasks` table to store a hash instead of the
token, and makes the corresponding change to
`janus_aggregator_core::task::AggregatorTask`. There are corresponding
changes to the aggregator API: the leader must now be provided the
collector auth token hash during task creation, and `TaskResp` now only
contains a collector auth token hash.

Additionally, we extend the aggregator API's config endpoint so that it
now can report a list of features that will be used by divvi-up API to
decide how to interact with it. The only such feature, which Janus
unconditionally advertises, is `token-hash`, indicating that Janus
stores token hashes instead of unhashed tokens where possible.

Part of #1509, #1524
@tgeoghegan tgeoghegan mentioned this issue Oct 3, 2023
9 tasks
tgeoghegan added a commit that referenced this issue Oct 4, 2023
The leader only needs the hash of the collector auth token in order to
authenticate collection sub-protocol requests and thus we can avoid the
risk of storing the unhashed token.

This commit changes the `tasks` table to store a hash instead of the
token, and makes the corresponding change to
`janus_aggregator_core::task::AggregatorTask`. There are corresponding
changes to the aggregator API: the leader must now be provided the
collector auth token hash during task creation, and `TaskResp` now only
contains a collector auth token hash.

Additionally, we extend the aggregator API's config endpoint so that it
now can report a list of features that will be used by divvi-up API to
decide how to interact with it. The only such feature, which Janus
unconditionally advertises, is `token-hash`, indicating that Janus
stores token hashes instead of unhashed tokens where possible.

Part of #1509, #1524
@tgeoghegan tgeoghegan reopened this Oct 4, 2023
@tgeoghegan
Copy link
Contributor Author

I believe we have all the Janus changes we want in place, but I'm going to keep this open until we prove that this works e2e with divviup-api, as the changes on that side have not landed yet.

@tgeoghegan
Copy link
Contributor Author

We have divviup/divviup-api#542 to track divviup-api changes, and bugfixes to resolve impedance mismatches can follow in later issues/PRs. Closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants