
Release version v0.4.0
  - Kokoro release: Cut off new release v0.4.0.
  - Add sampledata from github repo to SOT
  - cleanup versions
  - Add a generic recommendation to shard input
  - build folder rename
  - patch scp repo for reproducible builds

NOKEYCHECK=True
GitOrigin-RevId: d8992516453120a6354f1f2b42e5aee50dc5820e
Privacy Sandbox Team authored and taoliaoleo committed Oct 13, 2022
1 parent 8ba4300 commit 8e065c1
Showing 576 changed files with 69,251 additions and 4,301 deletions.
6 changes: 6 additions & 0 deletions .bazelignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
# Ignore build rules in top level 'target' folder created by maven builds.
target
# Ignore build rules in top level 'out' folder created by IntelliJ.
out

tools/load_tests/
8 changes: 8 additions & 0 deletions .bazelrc
@@ -0,0 +1,8 @@
test --test_output=all --java_runtime_version=remotejdk_11
build --java_language_version=11
build --java_toolchain=@bazel_tools//tools/jdk:toolchain_java11
# By default the JVM uses a memory-mapped file for the PerfData structure so that tools can easily access the data.
# This causes issues when the JVM unlinks the file at shutdown and may trigger a sandbox fault http://b/205838938.
# -XX:+PerfDisableSharedMem forces the JVM to use regular memory instead.
# -XX:-UsePerfData disables /tmp/hsperfdata references entirely. We don't use the perf data here, so we disable it.
build --jvmopt="-XX:-UsePerfData"
1 change: 1 addition & 0 deletions .bazelversion
@@ -0,0 +1 @@
4.2.2
34 changes: 28 additions & 6 deletions .gitignore
@@ -1,7 +1,29 @@
# Terraform ignores
terraform.tfstate*
.terraform/
crds.yml
bazel-*
!bazel-buildifier
.project
/.ijwb/
/.aswb/
/.clwb/
.vscode
.idea/
*.iml
*~
.terraform
# NOTE: .terraform.lock.hcl should be checked into git
# https://www.terraform.io/docs/language/dependency-lock.html#lock-file-location

# downloaded artifacts
jars
# Ignore generated files for eclipse lsp
/pom.xml
/.settings
/.classpath
/.factorypath
/target
/bin

# don't track prebuilt/local-built artifacts and terraform release-managed scripts
/terraform/aws/control-plane-shared-libraries
/terraform/aws/jars
/terraform/aws/environments/applications
/terraform/aws/environments/modules
/terraform/aws/environments/shared
/terraform/aws/environments/demo
23 changes: 9 additions & 14 deletions terraform/aws/download-dependencies.sh → BUILD
100755 → 100644
@@ -1,4 +1,3 @@
#!/bin/sh
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -13,20 +12,16 @@
# See the License for the specific language governing permissions and
# limitations under the License.

package(default_visibility = ["//visibility:public"])

mkdir -p jars
load("@com_github_bazelbuild_buildtools//buildifier:def.bzl", "buildifier")

gcs_url="https://storage.googleapis.com/trusted-execution-aggregation-service-public-artifacts"
buildifier(
name = "buildifier_check",
mode = "check",
)

jars=(
"AwsChangeHandlerLambda"
"aws_apigateway_frontend"
"AwsFrontendCleanupLambda"
"AsgCapacityHandlerLambda"
"TerminatedInstanceHandlerLambda"
buildifier(
name = "buildifier_fix",
mode = "fix",
)
version=`cat ../../VERSION`
for jar in "${jars[@]}"; do
echo "Downloading ${jar}_${version}.jar ..."
curl -o jars/${jar}_$version.jar ${gcs_url}/${version}/${jar}_${version}.jar
done
15 changes: 0 additions & 15 deletions CHANGELOG.md

This file was deleted.

10 changes: 5 additions & 5 deletions COLLECTING.md
@@ -69,7 +69,7 @@ bytes may be represented as ASCII characters, others are unicode escaped.

The [sample report](#aggregatable-report-sample) lists a `debug_cleartext_payload`
field that is *not* encrypted and can be processed with the
[local testing tool](https://storage.googleapis.com/trusted-execution-aggregation-service-public-artifacts/0.3.0/LocalTestingTool_0.3.0.jar).
[local testing tool](https://storage.googleapis.com/control-plane-shared-libraries-public-artifacts/{VERSION}/LocalTestingTool_{VERSION}.jar).

When testing the aggregation service locally and on Amazon Web Services
[Nitro Enclaves](https://aws.amazon.com/ec2/nitro/nitro-enclaves/),
@@ -305,7 +305,7 @@ in a domain file `output_domain.avro` with the following Avro schema.
You can use the [Avro Tools](https://www.apache.org/dyn/closer.cgi/avro/) to
generate an `output_domain.avro` file from a JSON input file.
You can download the Avro Tools jar 1.11.1 [here](http://archive.apache.org/dist/avro/avro-1.11.1/java/avro-tools-1.11.1.jar)
You can download the Avro Tools jar 1.11.0 [here](https://dlcdn.apache.org/avro/avro-1.11.0/java/avro-tools-1.11.0.jar)
We use the following `output_domain.json` input file to generate our
`output_domain.avro` file. This uses the bucket from the above
@@ -321,16 +321,16 @@ unicode escaped "characters" to encode the byte array bucket value.
To generate the `output_domain.avro` file use the above JSON file and domain schema file:
```sh
java -jar avro-tools-1.11.1.jar fromjson \
java -jar avro-tools-1.11.0.jar fromjson \
--schema-file output_domain.avsc output_domain.json > output_domain.avro
```
### Produce a summary report locally
Using the [local testing tool](https://storage.googleapis.com/trusted-execution-aggregation-service-public-artifacts/0.3.0/LocalTestingTool_0.3.0.jar),
Using the [local testing tool](https://storage.googleapis.com/control-plane-shared-libraries-public-artifacts/{VERSION}/LocalTestingTool_{VERSION}.jar),
you now can generate a summary report. [See all flags and descriptions](./API.md#local-testing-tool)
*Note: The `SHA256` of the `LocalTestingTool_{version}.jar` is `f3da41b974341863b6d58de37b7eda34f0e9b85fe074ee829d41be2afea5d19a`
*Note: The `SHA256` of the `LocalTestingTool_{version}.jar` is `{LocalTestingTool_{version}.jar-SHA}`
obtained with `openssl sha256 <jar>`.*
We will run the tool, without adding noise to the summary report, to receive the
3 changes: 3 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,3 @@
# How to Contribute

Presently this project is not accepting contributions.
120 changes: 120 additions & 0 deletions DEBUGGING.md
@@ -0,0 +1,120 @@
# Debug aggregation runs with encrypted payloads

This document describes the debugging support for the Aggregation Service running in a TEE using encrypted payloads of aggregatable reports. This allows you to debug your production setup and understand how the encrypted payloads of aggregatable reports are processed. Reports with debug_cleartext_payload can be used with the [local testing tool](./README.md#using-the-local-testing-tool) and are helpful for understanding the content of reports and validating that registrations on the browser client or device are configured properly.

To test the Aggregation Service, you can enable debug aggregation runs which use encrypted payloads of aggregatable reports to generate debug summary reports. When executing a debug run, no noise is added to the debug summary report, and annotations are added to indicate whether keys are present in domain input and/or reports. This allows developers to:

* Analyze the reports
* Determine if the aggregation was completed correctly, per the adtech’s specifications
* Understand the impact of noise in summary reports
* Determine how to set the proper domain keys

Additionally, debug runs do not enforce the [No-Duplicates rule](https://github.com/WICG/attribution-reporting-api/blob/main/AGGREGATION_SERVICE_TEE.md#no-duplicates-rule) across batches. The No-Duplicates rule is still enforced within a batch. This allows adtech to try different batches without worrying about making them [disjoint](https://github.com/WICG/attribution-reporting-api/blob/main/AGGREGATION_SERVICE_TEE.md#disjoint-batches) during testing or debugging.

Once third-party cookies are deprecated, the client (browser or operating system) will no longer generate aggregatable reports that are enabled for debugging. At that time, debug runs with encrypted payloads will no longer be supported for reports from real user devices or browsers.

In this document, you’ll find code snippets and instructions for how to debug the Aggregation Service and create debug summary reports.

## Create a debug job

To create an aggregation debug job, add the `debug_run` parameter to the `job_parameters` object of the `createJob` API request.

`POST https://<frontend_api_id>.execute-api.us-east-1.amazonaws.com/stage/v1alpha/createJob`

```json
{
  "input_data_blob_prefix": "input/reports.avro",
  "input_data_bucket_name": "<data_bucket>",
  "output_data_blob_prefix": "output/summary_report.avro",
  "output_data_bucket_name": "<data_bucket>",
  "job_parameters": {
    "attribution_report_to": "<your_attribution_domain>",
    "output_domain_blob_prefix": "domain/domain.avro",
    "output_domain_bucket_name": "<data_bucket>",
    "debug_run": "true"
  },
  "job_request_id": "test01"
}
```
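As an illustrative sketch (not part of the service), such a request body might be assembled programmatically; the helper name, bucket, and domain values here are placeholders:

```python
def build_debug_job_request(data_bucket, attribution_domain, job_request_id):
    """Assemble a createJob body with debug_run enabled.

    The prefixes mirror the example request above; the bucket name,
    attribution domain, and job id are caller-supplied placeholders.
    """
    return {
        "input_data_blob_prefix": "input/reports.avro",
        "input_data_bucket_name": data_bucket,
        "output_data_blob_prefix": "output/summary_report.avro",
        "output_data_bucket_name": data_bucket,
        "job_parameters": {
            "attribution_report_to": attribution_domain,
            "output_domain_blob_prefix": "domain/domain.avro",
            "output_domain_bucket_name": data_bucket,
            # debug_run is a string flag, not a boolean
            "debug_run": "true",
        },
        "job_request_id": job_request_id,
    }

request_body = build_debug_job_request("my-bucket", "https://adtech.example", "test01")
```

The resulting dictionary can then be serialized to JSON and POSTed to the `createJob` endpoint.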

If `debug_run` is not present in `job_parameters`, or it's set to `false`, a normal noised aggregation run is created. More details about the `createJob` API can be found in the [detailed API spec](./API.md#createjob-endpoint).

## Debuggable aggregatable reports

A debug run only considers reports that have the flag `"debug_mode": "enabled"` in the report's `shared_info` ([aggregatable report sample](./COLLECTING.md#aggregatable-report-sample)). Reports that are missing the `debug_mode` flag, or whose `debug_mode` value isn't set to `enabled`, aren't included in the results generated by a debug run.

The count of reports that were not processed during a debug run is returned in the job response, as described in the [detailed API spec](./API.md#createjob-endpoint). In the `error_counts` field, the category `NUM_REPORTS_DEBUG_NOT_ENABLED` shows the number of reports skipped during the debug run.
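The filtering rule can be sketched as follows (a simplified illustration, assuming each report is a dict with its `shared_info` already decoded into a dict; this is not the service's implementation):

```python
def partition_for_debug_run(reports):
    """Split decoded reports into those a debug run processes and those it skips."""
    debuggable, skipped = [], []
    for report in reports:
        if report.get("shared_info", {}).get("debug_mode") == "enabled":
            debuggable.append(report)
        else:
            # These would surface under NUM_REPORTS_DEBUG_NOT_ENABLED in error_counts.
            skipped.append(report)
    return debuggable, skipped
```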


## Results

Two summary reports are generated from a debug run: a regular summary report and a debug summary report. The regular summary report format generated from the debug run is consistent with that of a regular aggregation run. The debug summary report has a [different format](#debug-summary-report). The path of the summary report is set in the [createJob](./API.md#createjob-endpoint) API. The debug summary report will be stored in the "debug" folder under the summary report's path with the same object name.

Considering the following createJob parameters for `output_data_bucket_name` and `output_data_blob_prefix`:

```json
"output_data_blob_prefix": "output/summary_report.avro",
"output_data_bucket_name": "<data_bucket>",
```

the following objects are created by a debug run:

`s3://<data_bucket>/output/summary_report.avro` and

`s3://<data_bucket>/output/debug/summary_report.avro`.

Note that the regular summary report generated during a debug run will only include reports that have the flag `"debug_mode": "enabled"` in the report's `shared_info`.
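The path rule above can be expressed as a small helper (a sketch only; the service derives this path internally):

```python
def debug_report_path(output_data_blob_prefix):
    """Return the debug summary report's object path.

    Same object name as the regular summary report, placed in a
    "debug" folder under the summary report's path.
    """
    folder, _, name = output_data_blob_prefix.rpartition("/")
    return f"{folder}/debug/{name}" if folder else f"debug/{name}"
```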

### Debug summary report

The debug summary report includes the following data:

* `bucket`: The aggregation key
* `unnoised_metric`: The aggregation value without noise
* `noise`: The approximate noise applied to the aggregated results in the regular summary report
* `annotations`: The annotations associated with the bucket

The keys in the debug summary report will include all the keys from the output domain.

If the key is only present in the output domain (not in any of the processed aggregatable reports), the key will be included in the debug report with `unnoised_metric=0` and `annotations=["in_domain"]`.

The keys that are only present in aggregatable reports (not in output domain) will also be included in the debug report with `unnoised_metric=<unnoised aggregated value>` and `annotations=["in_reports"]`.

Keys that are present in both the domain and the aggregatable reports will have both annotations: `["in_domain", "in_reports"]`.
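The annotation rules above can be sketched like this (illustrative only; the input shapes are assumptions, not the service's internal types):

```python
def annotate_buckets(domain_keys, report_aggregates):
    """Build debug-report rows from domain keys and per-bucket aggregates.

    domain_keys: set of bucket keys from the output domain.
    report_aggregates: dict mapping bucket key -> unnoised aggregated value.
    """
    rows = {}
    for key in domain_keys | set(report_aggregates):
        annotations = []
        if key in domain_keys:
            annotations.append("in_domain")
        if key in report_aggregates:
            annotations.append("in_reports")
        rows[key] = {
            # Domain-only keys get an unnoised metric of 0.
            "unnoised_metric": report_aggregates.get(key, 0),
            "annotations": annotations,
        }
    return rows
```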

The schema of debug summary reports is in the following [Avro](https://avro.apache.org/) format:

```avro
{
  "type": "record",
  "name": "DebugAggregatedFact",
  "fields": [
    {
      "name": "bucket",
      "type": "bytes",
      "doc": "Histogram bucket used in aggregation. 128-bit integer encoded as a 16-byte big-endian bytestring. Leading 0-bits will be left out."
    },
    {
      "name": "unnoised_metric",
      "type": "long",
      "doc": "Unnoised metric associated with the bucket."
    },
    {
      "name": "noise",
      "type": "long",
      "doc": "The noise applied to the metric in the regular result."
    },
    {
      "name": "annotations",
      "type": {
        "type": "array",
        "items": {
          "type": "enum",
          "name": "bucket_tags",
          "symbols": ["in_domain", "in_reports"]
        }
      }
    }
  ]
}
```
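Since Avro schemas are plain JSON, the field layout can be sanity-checked with the standard library (`doc` strings omitted here for brevity; this is a quick illustration, not a full Avro validation):

```python
import json

DEBUG_FACT_SCHEMA = """
{
  "type": "record",
  "name": "DebugAggregatedFact",
  "fields": [
    {"name": "bucket", "type": "bytes"},
    {"name": "unnoised_metric", "type": "long"},
    {"name": "noise", "type": "long"},
    {"name": "annotations", "type": {
      "type": "array",
      "items": {"type": "enum", "name": "bucket_tags",
                "symbols": ["in_domain", "in_reports"]}}}
  ]
}
"""

schema = json.loads(DEBUG_FACT_SCHEMA)
field_names = [f["name"] for f in schema["fields"]]
print(field_names)  # ['bucket', 'unnoised_metric', 'noise', 'annotations']
```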
4 changes: 2 additions & 2 deletions DEPENDENCIES.md
@@ -2,8 +2,8 @@

The deployment of the Amazon Web Services [Nitro Enclaves](https://aws.amazon.com/ec2/nitro/nitro-enclaves/) based Aggregation Service depends on several packaged
artifacts listed below.
These artifacts can be downloaded with the [download-dependencies.sh](./terraform/aws/download-dependencies.sh)
script.
These artifacts can be downloaded with the [download_prebuilt_dependencies.sh](./terraform/aws/download_prebuilt_dependencies.sh)
script. See [README](/README.md#download-terraform-scripts-and-prebuilt-dependencies).

## Packaged AWS Lambda Jars

