Release version v1.0.3
  - Kokoro release: Cut off new release v1.0.3.
  - internal: Updated CHANGELOG.md for 1.0.3
  - doc: Update documentation for PRIVACY_BUDGET_EXHAUSTED er...
  - doc: Add note about multiple NUMA nodes instances
  - doc: Add sizing guidance in docs
  - Correct api.md url in error message referencing api.md fi...
  - internal: Update dependencies

GitOrigin-RevId: 0693f705740147f768106b7b68d8672ecb576178
Privacy Sandbox Team authored and hostirosti committed Sep 8, 2023
1 parent 41eb45a commit 7315ef8
Showing 12 changed files with 198 additions and 15 deletions.
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,15 @@
# Changelog

## [1.0.3](https://github.com/privacysandbox/aggregation-service/compare/v1.0.2...v1.0.3) (2023-09-05)

### Changes

- Updated build container dependencies.
- Updated documentation for PRIVACY_BUDGET_EXHAUSTED error code.
- Updated aws-aggregation-service.md with a note about multiple NUMA nodes instances.
- Added sizing-guidance.md for sizing guidance.
- Corrected api.md url in error message referencing api.md file.

## [1.0.2](https://github.com/privacysandbox/aggregation-service/compare/v1.0.1...v1.0.2) (2023-08-03)

### Changes
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.0.2
1.0.3
14 changes: 7 additions & 7 deletions WORKSPACE
@@ -315,7 +315,7 @@ load("@io_bazel_rules_docker//container:container.bzl", "container_pull")
# Distroless image for running Java.
container_pull(
name = "java_base",
# Using SHA-256 for reproducibility. The tag is latest-amd64. Latest as of 2023-08-03.
# Using SHA-256 for reproducibility. The tag is latest-amd64. Latest as of 2023-09-05.
digest = "sha256:052076466984fd56979c15a9c3b7433262b0ad9aae55bc0c53d1da8ffdd829c3",
registry = "gcr.io",
repository = "distroless/java17-debian11",
@@ -344,11 +344,11 @@ container_pull(
# Pulls AWS Otel Collector
container_pull(
name = "aws_otel_collector",
# latest as of 2023-08-03.
digest = "sha256:4703de9f02fdb23602b2e9961aeb151e476775ce2ac38a401b4864c3f979644d",
# latest as of 2023-09-05.
digest = "sha256:2a6183f63e637b940584e8ebf5335bd9a2581ca16ee400e2e74b7b488825adb4",
registry = "public.ecr.aws",
repository = "aws-observability/aws-otel-collector",
tag = "v0.31.0",
tag = "v0.32.0",
)

#############
@@ -479,11 +479,11 @@ http_archive(
# Needed for reproducibly building AL2 binaries (e.g. //cc/aws/proxy)
container_pull(
name = "amazonlinux_2",
# Latest as of 2023-08-03.
digest = "sha256:8493322bcbf25417bb2acf5de53302811c30558965fb9c91bd8dfe3a9db7a06e",
# Latest as of 2023-09-05.
digest = "sha256:993d82940dba5370065dd5afb99fab56cdaf9f7b88800e88ddbd622678a6d3ea",
registry = "index.docker.io",
repository = "amazonlinux",
tag = "2.0.20230719.0",
tag = "2.0.20230822.0",
)

################################################################################
4 changes: 2 additions & 2 deletions build-scripts/aws/build-container/Dockerfile
@@ -12,8 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# bookworm-slim stable with latest security updates 2023-08-03
FROM debian@sha256:5bbfcb9f36a506f9c9c2fb53205f15f6e9d1f0e032939378ddc049a2d26d651e
# bookworm-slim stable with latest security updates 2023-09-05
FROM debian@sha256:a60c0c42bc6bdc09d91cd57067fcc952b68ad62de651c4cf939c27c9f007d1c5

RUN \
# This makes add-apt-repository available.
Binary file added docs/assets/instance-type-recommendation.png
7 changes: 7 additions & 0 deletions docs/aws-aggregation-service.md
@@ -143,6 +143,13 @@ Make the following adjustments in the `<repository_root>/terraform/aws/environme
- alarm_notification_email: Email to receive alarm notifications. Requires confirmation
subscription through sign up email sent to this address.

Note: If you want to use an instance type other than the default one specified in the
configuration, we recommend using an instance type with a single NUMA node. Memory and CPUs for
the enclave must be from the [same NUMA node](https://docs.kernel.org/virt/ne_overview.html);
however, a single NUMA node on AWS EC2 has a maximum of 48 cores. Please refer to
[sizing guidance](/docs/sizing-guidance.md) for instance type recommendations. For AWS
instance-specific questions, please contact AWS support.

1. **Skip this step if you use our prebuilt AMI and Lambda jars**

If you [self-build your AMI and jars](/build-scripts/aws/README.md), you need to copy the
165 changes: 165 additions & 0 deletions docs/sizing-guidance.md
@@ -0,0 +1,165 @@
## Background

Ad techs use the aggregation service to generate summary reports from aggregatable reports (see the
[workflow](https://github.com/WICG/attribution-reporting-api/blob/main/AGGREGATION_SERVICE_TEE.md#aggregation-workflow)).
The aggregation service can process jobs of different input sizes (reports and pre-declared
aggregation buckets). Each report can include a different number of events. Each event is a
key-value pair in an aggregatable report's encrypted payload. A pre-declared aggregation bucket is
an entry (also referred to as domain key in this document) in the pre-declared buckets file that is
provided as an input in the aggregation job (see
[createJob API documentation](https://github.com/privacysandbox/aggregation-service/blob/main/docs/api.md#payload)).
Each report used to generate this guidance has a payload with 10 events.

## Which Cloud compute instance type to use for aggregation? -

The table below provides cloud instance type guidelines for cost-efficient processing of aggregation
jobs with different numbers of reports and domain keys. The recommendation is based on
[job processing time](#how-long-does-an-aggregation-job-take--) and estimated EC2 instance cost from
[Amazon EC2 Pricing](https://aws.amazon.com/ec2/pricing/on-demand/). For example, if your individual
jobs are below 50K reports and 50 million domain keys, the m5.2xlarge EC2 instance type is recommended.

Note that this is a general guideline, and ad techs may need to use a different instance type
depending on their specific needs. For example, jobs that must finish within a tighter time bound
may need an instance type with larger compute capacity.

The table considers the following AWS EC2 instance types: m5.2xlarge (default), m5.4xlarge,
m5.8xlarge, m5.12xlarge, r5.2xlarge, r5.4xlarge, and r5.8xlarge.

For a given aggregation job, the rows (R) correspond to the input reports. The columns (D)
correspond to the domain keys (in millions).

![instance-type-recommendation](assets/instance-type-recommendation.png)

The instance types here are subject to change.

## How long does an aggregation job take? -

The aggregation job completion time is the time from when an instance picks up the job from the job
queue to when the results are written to S3. It depends on a number of factors, such as the size of
the job inputs, the service instance compute capacity, and the auto scaling size.

The following tables show the approximate processing time bounds of jobs of varying size after a job
is picked up for processing by an aggregation worker. Each report includes 10 events and each event
is a key-value pair in the report's payload. The domain key is a bucket/key in the pre-declared
buckets file that is provided as an input parameter to the aggregation job.

The processing times of the jobs are split into brackets. For example, the "00 hr 02 min"
bracket indicates that the job ran for more than 1 minute but less than or equal to 2 minutes. The
"n/a" bracket indicates that the instance was not able to process the job, and a higher capacity
instance is needed.

The processing time values here are subject to change.

| Running time buckets |
| -------------------- |
| 00 hr 01 min |
| 00 hr 02 min |
| 00 hr 10 min |
| 00 hr 20 min |
| 01 hr 00 min |
| 02 hr 00 min |
| 03 hr 00 min |
| 04 hr 00 min |
| 06 hr 00 min |
| 10 hr 00 min |
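
The bracket rule described above can be sketched in a few lines of Python. This is only an illustration, not part of the aggregation service: the function name `running_time_bracket` is hypothetical, and it assumes every bracket works like the "00 hr 02 min" example, that is, a job falls into the smallest bracket whose upper bound is greater than or equal to its running time.

```python
from bisect import bisect_left

# Upper bounds of the running time brackets listed above, in minutes.
BRACKET_UPPER_BOUNDS_MIN = [1, 2, 10, 20, 60, 120, 180, 240, 360, 600]


def running_time_bracket(elapsed_minutes):
    """Return the bracket label for a job that ran for elapsed_minutes.

    Assumption: a bracket covers (previous bound, own bound], as in the
    "00 hr 02 min" example. Returns None when the running time exceeds the
    largest bracket. The "n/a" entries in the tables mean something
    different (the instance could not process the job), so "n/a" is never
    returned here.
    """
    if elapsed_minutes < 0:
        raise ValueError("elapsed_minutes must be non-negative")
    # bisect_left finds the first bound that is >= elapsed_minutes.
    index = bisect_left(BRACKET_UPPER_BOUNDS_MIN, elapsed_minutes)
    if index == len(BRACKET_UPPER_BOUNDS_MIN):
        return None
    bound = BRACKET_UPPER_BOUNDS_MIN[index]
    return f"{bound // 60:02d} hr {bound % 60:02d} min"
```

For instance, a job that ran for 1.5 minutes maps to "00 hr 02 min", and a job that ran for 45 minutes maps to "01 hr 00 min".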

In the following tables, the rows (R) correspond to the input reports. The columns (D) correspond to
the domain keys (in millions).

- m5.2xlarge

| R / D | 1M | 50M | 70M | 80M | 100M | 200M |
| ----- | ------------ | ------------ | ------------ | --- | ---- | ---- |
| 10K | 00 hr 02 min | 00 hr 20 min | 00 hr 20 min | n/a | n/a | n/a |
| 50K | 00 hr 02 min | 00 hr 20 min | 00 hr 20 min | n/a | n/a | n/a |
| 100K | 00 hr 02 min | 00 hr 20 min | 00 hr 20 min | n/a | n/a | n/a |
| 1M | 00 hr 02 min | 00 hr 20 min | 00 hr 20 min | n/a | n/a | n/a |
| 10M | 00 hr 20 min | 01 hr 00 min | 01 hr 00 min | n/a | n/a | n/a |
| 100M | 03 hr 00 min | 04 hr 00 min | n/a | n/a | n/a | n/a |
| 200M | 06 hr 00 min | n/a | n/a | n/a | n/a | n/a |
| 300M | 10 hr 00 min | n/a | n/a | n/a | n/a | n/a |
| 400M | n/a | n/a | n/a | n/a | n/a | n/a |

- m5.4xlarge

| R / D | 1M | 50M | 70M | 80M | 100M | 200M |
| ----- | ------------ | ------------ | ------------ | ------------ | ------------ | ---- |
| 10K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | n/a |
| 50K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | n/a |
| 100K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | n/a |
| 1M | 00 hr 02 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | n/a |
| 10M | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | n/a | n/a |
| 100M | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | n/a | n/a |
| 200M | 02 hr 00 min | 03 hr 00 min | n/a | n/a | n/a | n/a |
| 300M | 04 hr 00 min | n/a | n/a | n/a | n/a | n/a |
| 400M | 06 hr 00 min | n/a | n/a | n/a | n/a | n/a |

- m5.8xlarge

| R / D | 1M | 50M | 70M | 80M | 100M | 200M |
| ----- | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 10K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 50K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 100K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 1M | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 10M | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 100M | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 02 hr 00 min |
| 200M | 01 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | n/a |
| 300M | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | n/a |
| 400M | 02 hr 00 min | 03 hr 00 min | 03 hr 00 min | 03 hr 00 min | 03 hr 00 min | n/a |

- m5.12xlarge

| R / D | 1M | 50M | 70M | 80M | 100M | 200M |
| ----- | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 10K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 50K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 100K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 1M | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 10M | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 100M | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min |
| 200M | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 02 hr 00 min |
| 300M | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min |
| 400M | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 03 hr 00 min |

- r5.2xlarge

| R / D | 1M | 50M | 70M | 80M | 100M | 200M |
| ----- | ------------ | ------------ | ------------ | ------------ | ------------ | ---- |
| 10K | 00 hr 02 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min | n/a |
| 50K | 00 hr 02 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min | n/a |
| 100K | 00 hr 02 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min | n/a |
| 1M | 00 hr 02 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min | n/a |
| 10M | 00 hr 20 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | n/a |
| 100M | 03 hr 00 min | 03 hr 00 min | 03 hr 00 min | 04 hr 00 min | n/a | n/a |
| 200M | 06 hr 00 min | 06 hr 00 min | n/a | n/a | n/a | n/a |
| 300M | 10 hr 00 min | n/a | n/a | n/a | n/a | n/a |
| 400M | n/a | n/a | n/a | n/a | n/a | n/a |

- r5.4xlarge

| R / D | 1M | 50M | 70M | 80M | 100M | 200M |
| ----- | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 10K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 50K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 100K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 1M | 00 hr 02 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 10M | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min | 01 hr 00 min |
| 100M | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min |
| 200M | 02 hr 00 min | 03 hr 00 min | 03 hr 00 min | 03 hr 00 min | 03 hr 00 min | n/a |
| 300M | 04 hr 00 min | 04 hr 00 min | 04 hr 00 min | 04 hr 00 min | 04 hr 00 min | n/a |
| 400M | 04 hr 00 min | n/a | n/a | n/a | n/a | n/a |

- r5.8xlarge

| R / D | 1M | 50M | 70M | 80M | 100M | 200M |
| ----- | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 10K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 50K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 100K | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 1M | 00 hr 01 min | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min |
| 10M | 00 hr 10 min | 00 hr 20 min | 00 hr 20 min | 00 hr 20 min | 01 hr 00 min | 01 hr 00 min |
| 100M | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 01 hr 00 min | 02 hr 00 min |
| 200M | 01 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min |
| 300M | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 02 hr 00 min | 03 hr 00 min |
| 400M | 03 hr 00 min | 03 hr 00 min | 03 hr 00 min | 03 hr 00 min | 03 hr 00 min | 03 hr 00 min |
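
As a rough illustration of how these tables might be read programmatically, the sketch below transcribes the m5.2xlarge table and looks up the time bound for a job. The function name `m5_2xlarge_time_bound` is hypothetical, and rounding a job size up to the next listed row and column is a conservative assumption of this sketch, not a rule stated in this document.

```python
from bisect import bisect_left

# Row labels (input reports) and column labels (domain keys) from the tables.
REPORT_ROWS = [10_000, 50_000, 100_000, 1_000_000, 10_000_000,
               100_000_000, 200_000_000, 300_000_000, 400_000_000]
DOMAIN_COLS = [1_000_000, 50_000_000, 70_000_000,
               80_000_000, 100_000_000, 200_000_000]

# m5.2xlarge bracket upper bounds in minutes; None stands for "n/a".
M5_2XLARGE_MINUTES = [
    [2, 20, 20, None, None, None],          # 10K reports
    [2, 20, 20, None, None, None],          # 50K
    [2, 20, 20, None, None, None],          # 100K
    [2, 20, 20, None, None, None],          # 1M
    [20, 60, 60, None, None, None],         # 10M
    [180, 240, None, None, None, None],     # 100M
    [360, None, None, None, None, None],    # 200M
    [600, None, None, None, None, None],    # 300M
    [None, None, None, None, None, None],   # 400M
]


def m5_2xlarge_time_bound(reports, domain_keys):
    """Return the bracket upper bound in minutes, or None for "n/a".

    Assumption: a job size between two listed values is rounded up to the
    next listed row or column; sizes beyond the table are treated as "n/a".
    """
    row = bisect_left(REPORT_ROWS, reports)
    col = bisect_left(DOMAIN_COLS, domain_keys)
    if row == len(REPORT_ROWS) or col == len(DOMAIN_COLS):
        return None
    return M5_2XLARGE_MINUTES[row][col]
```

Under these assumptions, a job with 50K reports and 50 million domain keys maps to the 20-minute bracket, while a job with 10K reports and 80 million domain keys maps to "n/a" on m5.2xlarge. Because the processing time values are subject to change, any hard-coded copy like this one would need to be kept in sync with the tables.
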
@@ -20,7 +20,8 @@
public enum AggregationWorkerReturnCode {
/**
* Unable to process the job because the user exhausted the allocated budget to aggregate the
* reports in this batch. This error is not transient and the job cannot be retried.
* reports in this batch. This can happen if some reports were already processed in an earlier
* batch. This error is not transient and the job cannot be retried.
*/
PRIVACY_BUDGET_EXHAUSTED,

@@ -106,7 +106,7 @@ public ListenableFuture<ImmutableSet<BigInteger>> readAndDedupDomain(
String.format(
"No output domain provided in the location. : %s. Please refer to the API"
+ " documentation for output domain parameters at"
+ " https://github.com/privacysandbox/aggregation-service/blob/main/docs/API.md",
+ " https://github.com/privacysandbox/aggregation-service/blob/main/docs/api.md",
outputDomainLocation)));
}
return domain;
@@ -61,7 +61,7 @@ public static void validate(Optional<Job> job, boolean domainOptional) {
"Job parameters for the job '%s' does not have output domain location specified in"
+ " 'output_domain_bucket_name' and 'output_domain_blob_prefix' fields. Please"
+ " refer to the API documentation for output domain parameters at"
+ " https://github.com/privacysandbox/aggregation-service/blob/main/docs/API.md",
+ " https://github.com/privacysandbox/aggregation-service/blob/main/docs/api.md",
jobKey));
String reportErrorThreshold =
jobParams.getOrDefault(JOB_PARAM_REPORT_ERROR_THRESHOLD_PERCENTAGE, null);
@@ -216,7 +216,7 @@ public void worker_domainOptionalFalse_noOutputDomainPath_throwsException() thro
"Job parameters for the job 'request' does not have output domain location specified in"
+ " 'output_domain_bucket_name' and 'output_domain_blob_prefix' fields. Please"
+ " refer to the API documentation for output domain parameters at"
+ " https://github.com/privacysandbox/aggregation-service/blob/main/docs/API.md");
+ " https://github.com/privacysandbox/aggregation-service/blob/main/docs/api.md");
}

@Test
@@ -109,7 +109,7 @@ public void validate_noOutputDomain_domainNotOptional_fails() {
"Job parameters for the job '' does not have output domain location specified in"
+ " 'output_domain_bucket_name' and 'output_domain_blob_prefix' fields. Please"
+ " refer to the API documentation for output domain parameters at"
+ " https://github.com/privacysandbox/aggregation-service/blob/main/docs/API.md");
+ " https://github.com/privacysandbox/aggregation-service/blob/main/docs/api.md");
}

@Test