
Release version v0.4.0
  - Kokoro release: Cut off new release v0.4.0.
  - Add sampledata from github repo to SOT
  - cleanup versions
  - Add a generic recommendation to shard input
  - build folder rename
  - patch scp repo for reproducible builds

NOKEYCHECK=True
GitOrigin-RevId: d8992516453120a6354f1f2b42e5aee50dc5820e
Privacy Sandbox Team authored and taoliaoleo committed Oct 13, 2022
1 parent 8ba4300 commit 8e065c1
Showing 576 changed files with 69,251 additions and 4,301 deletions.
6 changes: 6 additions & 0 deletions .bazelignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
# Ignore build rules in top level 'target' folder created by maven builds.
target
# Ignore build rules in top level 'out' folder created by IntelliJ.
out

tools/load_tests/
8 changes: 8 additions & 0 deletions .bazelrc
@@ -0,0 +1,8 @@
test --test_output=all --java_runtime_version=remotejdk_11
build --java_language_version=11
build --java_toolchain=@bazel_tools//tools/jdk:toolchain_java11
# By default the JVM uses a memory-mapped file for the PerfData structure so that tools can easily access the data.
# This causes issues when the JVM unlinks the file at shutdown and may trigger a sandbox fault http://b/205838938.
# -XX:+PerfDisableSharedMem forces the JVM to use regular memory instead.
# -XX:-UsePerfData disables /tmp/hsperfdata references entirely. We don't use the perf data here, so we disable it.
build --jvmopt="-XX:-UsePerfData"
1 change: 1 addition & 0 deletions .bazelversion
@@ -0,0 +1 @@
4.2.2
34 changes: 28 additions & 6 deletions .gitignore
@@ -1,7 +1,29 @@
# Terraform ignores
terraform.tfstate*
.terraform/
crds.yml
bazel-*
!bazel-buildifier
.project
/.ijwb/
/.aswb/
/.clwb/
.vscode
.idea/
*.iml
*~
.terraform
# NOTE: .terraform.lock.hcl should be checked into git
# https://www.terraform.io/docs/language/dependency-lock.html#lock-file-location

# downloaded artifacts
jars
# Ignore generated files for eclipse lsp
/pom.xml
/.settings
/.classpath
/.factorypath
/target
/bin

# don't track prebuilt/local-built artifacts and terraform release-managed scripts
/terraform/aws/control-plane-shared-libraries
/terraform/aws/jars
/terraform/aws/environments/applications
/terraform/aws/environments/modules
/terraform/aws/environments/shared
/terraform/aws/environments/demo
23 changes: 9 additions & 14 deletions terraform/aws/download-dependencies.sh → BUILD
100755 → 100644
@@ -1,4 +1,3 @@
#!/bin/sh
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -13,20 +12,16 @@
# See the License for the specific language governing permissions and
# limitations under the License.

package(default_visibility = ["//visibility:public"])

mkdir -p jars
load("@com_github_bazelbuild_buildtools//buildifier:def.bzl", "buildifier")

gcs_url="https://storage.googleapis.com/trusted-execution-aggregation-service-public-artifacts"
buildifier(
name = "buildifier_check",
mode = "check",
)

jars=(
"AwsChangeHandlerLambda"
"aws_apigateway_frontend"
"AwsFrontendCleanupLambda"
"AsgCapacityHandlerLambda"
"TerminatedInstanceHandlerLambda"
buildifier(
name = "buildifier_fix",
mode = "fix",
)
version=`cat ../../VERSION`
for jar in "${jars[@]}"; do
echo "Downloading ${jar}_${version}.jar ..."
curl -o jars/${jar}_$version.jar ${gcs_url}/${version}/${jar}_${version}.jar
done
15 changes: 0 additions & 15 deletions CHANGELOG.md

This file was deleted.

10 changes: 5 additions & 5 deletions COLLECTING.md
@@ -69,7 +69,7 @@ bytes may be represented as ASCII characters, others are unicode escaped.

The [sample report](#aggregatable-report-sample) lists a `debug_cleartext_payload`
field that is *not* encrypted and can be processed with the
[local testing tool](https://storage.googleapis.com/trusted-execution-aggregation-service-public-artifacts/0.3.0/LocalTestingTool_0.3.0.jar).
[local testing tool](https://storage.googleapis.com/control-plane-shared-libraries-public-artifacts/{VERSION}/LocalTestingTool_{VERSION}.jar).

When testing the aggregation service locally and on Amazon Web Services
[Nitro Enclaves](https://aws.amazon.com/ec2/nitro/nitro-enclaves/),
@@ -305,7 +305,7 @@ in a domain file `output_domain.avro` with the following Avro schema.
You can use the [Avro Tools](https://www.apache.org/dyn/closer.cgi/avro/) to
generate an `output_domain.avro` file from a JSON input file.
You can download the Avro Tools jar 1.11.1 [here](http://archive.apache.org/dist/avro/avro-1.11.1/java/avro-tools-1.11.1.jar)
You can download the Avro Tools jar 1.11.0 [here](https://dlcdn.apache.org/avro/avro-1.11.0/java/avro-tools-1.11.0.jar)
We use the following `output_domain.json` input file to generate our
`output_domain.avro` file. This uses the bucket from the above
@@ -321,16 +321,16 @@ unicode escaped "characters" to encode the byte array bucket value.
To generate the `output_domain.avro` file use the above JSON file and domain schema file:
```sh
java -jar avro-tools-1.11.1.jar fromjson \
java -jar avro-tools-1.11.0.jar fromjson \
--schema-file output_domain.avsc output_domain.json > output_domain.avro
```
### Produce a summary report locally
Using the [local testing tool](https://storage.googleapis.com/trusted-execution-aggregation-service-public-artifacts/0.3.0/LocalTestingTool_0.3.0.jar),
Using the [local testing tool](https://storage.googleapis.com/control-plane-shared-libraries-public-artifacts/{VERSION}/LocalTestingTool_{VERSION}.jar),
you now can generate a summary report. [See all flags and descriptions](./API.md#local-testing-tool)
*Note: The `SHA256` of the `LocalTestingTool_{version}.jar` is `f3da41b974341863b6d58de37b7eda34f0e9b85fe074ee829d41be2afea5d19a`
*Note: The `SHA256` of the `LocalTestingTool_{version}.jar` is `{LocalTestingTool_{version}.jar-SHA}`
obtained with `openssl sha256 <jar>`.*
We will run the tool, without adding noise to the summary report, to receive the
3 changes: 3 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,3 @@
# How to Contribute

Presently this project is not accepting contributions.
120 changes: 120 additions & 0 deletions DEBUGGING.md
@@ -0,0 +1,120 @@
# Debug aggregation runs with encrypted payloads

This document describes the debugging support for the Aggregation Service running in a TEE using encrypted payloads of aggregatable reports. This allows you to debug your production setup and understand how the encrypted payloads of aggregatable reports are processed. Reports with debug_cleartext_payload can be used with the [local testing tool](./README.md#using-the-local-testing-tool) and are helpful for understanding the content of reports and validating that registrations on the browser client or device are configured properly.

To test the Aggregation Service, you can enable debug aggregation runs which use encrypted payloads of aggregatable reports to generate debug summary reports. When executing a debug run, no noise is added to the debug summary report, and annotations are added to indicate whether keys are present in domain input and/or reports. This allows developers to:

* Analyze the reports
* Determine if the aggregation was completed correctly, per the adtech’s specifications
* Understand the impact of noise in summary reports
* Determine how to set the proper domain keys

Additionally, debug runs do not enforce the [No-Duplicates rule](https://github.com/WICG/attribution-reporting-api/blob/main/AGGREGATION_SERVICE_TEE.md#no-duplicates-rule) across batches. The No-Duplicates rule is still enforced within a batch. This allows adtech to try different batches without worrying about making them [disjoint](https://github.com/WICG/attribution-reporting-api/blob/main/AGGREGATION_SERVICE_TEE.md#disjoint-batches) during testing or debugging.

Once third-party cookies are deprecated, the client (browser or operating system) will no longer generate aggregatable reports that are enabled for debugging. At that time, debug runs with encrypted payloads will no longer be supported for reports from real user devices or browsers.

In this document, you’ll find code snippets and instructions for how to debug the Aggregation Service and create debug summary reports.

## Create a debug job

To create an aggregation debug job, add the `debug_run` parameter to the `job_parameters` object of the `createJob` API request.

`POST https://<frontend_api_id>.execute-api.us-east-1.amazonaws.com/stage/v1alpha/createJob`

```json
{
  "input_data_blob_prefix": "input/reports.avro",
  "input_data_bucket_name": "<data_bucket>",
  "output_data_blob_prefix": "output/summary_report.avro",
  "output_data_bucket_name": "<data_bucket>",
  "job_parameters": {
    "attribution_report_to": "<your_attribution_domain>",
    "output_domain_blob_prefix": "domain/domain.avro",
    "output_domain_bucket_name": "<data_bucket>",
    "debug_run": "true"
  },
  "job_request_id": "test01"
}
```
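As an illustrative sketch (not part of the service), such a request body might be assembled programmatically; the helper name, bucket, and domain values here are placeholders:

```python
def build_debug_job_request(data_bucket, attribution_domain, job_request_id):
    """Assemble a createJob body with debug_run enabled.

    The prefixes mirror the example request above; the bucket name,
    attribution domain, and job id are caller-supplied placeholders.
    """
    return {
        "input_data_blob_prefix": "input/reports.avro",
        "input_data_bucket_name": data_bucket,
        "output_data_blob_prefix": "output/summary_report.avro",
        "output_data_bucket_name": data_bucket,
        "job_parameters": {
            "attribution_report_to": attribution_domain,
            "output_domain_blob_prefix": "domain/domain.avro",
            "output_domain_bucket_name": data_bucket,
            # debug_run is a string flag, not a boolean
            "debug_run": "true",
        },
        "job_request_id": job_request_id,
    }

request_body = build_debug_job_request("my-bucket", "https://adtech.example", "test01")
```

The resulting dictionary can then be serialized to JSON and POSTed to the `createJob` endpoint.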

If `debug_run` is not present in `job_parameters`, or it's set to `false`, a normal noised aggregation run is created. More details about the `createJob` API can be found in the [detailed API spec](./API.md#createjob-endpoint).

## Debuggable aggregatable reports

A debug run only considers reports that have the flag `"debug_mode": "enabled"` in the report's `shared_info` ([aggregatable report sample](./COLLECTING.md#aggregatable-report-sample)). Reports that are missing the `debug_mode` flag, or whose `debug_mode` value isn't set to `enabled`, aren't included in the results generated by a debug run.

The count of reports that were not processed during a debug run is returned in the job response, as described in the [detailed API spec](./API.md#createjob-endpoint). In the `error_counts` field, the category `NUM_REPORTS_DEBUG_NOT_ENABLED` shows the number of reports skipped during the debug run.
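The filtering rule can be sketched as follows (a simplified illustration, assuming each report is a dict with its `shared_info` already decoded into a dict; this is not the service's implementation):

```python
def partition_for_debug_run(reports):
    """Split decoded reports into those a debug run processes and those it skips."""
    debuggable, skipped = [], []
    for report in reports:
        if report.get("shared_info", {}).get("debug_mode") == "enabled":
            debuggable.append(report)
        else:
            # These would surface under NUM_REPORTS_DEBUG_NOT_ENABLED in error_counts.
            skipped.append(report)
    return debuggable, skipped
```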


## Results

Two summary reports are generated from a debug run: a regular summary report and a debug summary report. The regular summary report format generated from the debug run is consistent with that of a regular aggregation run. The debug summary report has a [different format](#debug-summary-report). The path of the summary report is set in the [createJob](./API.md#createjob-endpoint) API. The debug summary report will be stored in the "debug" folder under the summary report's path with the same object name.

Considering the following createJob parameters for `output_data_bucket_name` and `output_data_blob_prefix`:

```json
"output_data_blob_prefix": "output/summary_report.avro",
"output_data_bucket_name": "<data_bucket>",
```

the following objects are created by a debug run:

`s3://<data_bucket>/output/summary_report.avro` and

`s3://<data_bucket>/output/debug/summary_report.avro`.

Note that the regular summary report generated during a debug run will only include reports that have the flag `"debug_mode": "enabled"` in the report's `shared_info`.
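The path rule above can be expressed as a small helper (a sketch only; the service derives this path internally):

```python
def debug_report_path(output_data_blob_prefix):
    """Return the debug summary report's object path.

    Same object name as the regular summary report, placed in a
    "debug" folder under the summary report's path.
    """
    folder, _, name = output_data_blob_prefix.rpartition("/")
    return f"{folder}/debug/{name}" if folder else f"debug/{name}"
```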

### Debug summary report

The debug summary report includes the following data:

* `bucket`: The aggregation key
* `unnoised_metric`: The aggregation value without noise
* `noise`: The approximate noise applied to the aggregated results in the regular summary report
* `annotations`: The annotations associated with the bucket

The keys in the debug summary report will include all the keys from the output domain.

If the key is only present in the output domain (not in any of the processed aggregatable reports), the key will be included in the debug report with `unnoised_metric=0` and `annotations=["in_domain"]`.

The keys that are only present in aggregatable reports (not in output domain) will also be included in the debug report with `unnoised_metric=<unnoised aggregated value>` and `annotations=["in_reports"]`.

Keys that are present in both the domain and the aggregatable reports will have both annotations: `["in_domain", "in_reports"]`.
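The annotation rules above can be sketched like this (illustrative only; the input shapes are assumptions, not the service's internal types):

```python
def annotate_buckets(domain_keys, report_aggregates):
    """Build debug-report rows from domain keys and per-bucket aggregates.

    domain_keys: set of bucket keys from the output domain.
    report_aggregates: dict mapping bucket key -> unnoised aggregated value.
    """
    rows = {}
    for key in domain_keys | set(report_aggregates):
        annotations = []
        if key in domain_keys:
            annotations.append("in_domain")
        if key in report_aggregates:
            annotations.append("in_reports")
        rows[key] = {
            # Domain-only keys get an unnoised metric of 0.
            "unnoised_metric": report_aggregates.get(key, 0),
            "annotations": annotations,
        }
    return rows
```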

The schema of debug summary reports is in the following [Avro](https://avro.apache.org/) format:

```avro
{
  "type": "record",
  "name": "DebugAggregatedFact",
  "fields": [
    {
      "name": "bucket",
      "type": "bytes",
      "doc": "Histogram bucket used in aggregation. 128-bit integer encoded as a 16-byte big-endian bytestring. Leading 0-bits will be left out."
    },
    {
      "name": "unnoised_metric",
      "type": "long",
      "doc": "Unnoised metric associated with the bucket."
    },
    {
      "name": "noise",
      "type": "long",
      "doc": "The noise applied to the metric in the regular result."
    },
    {
      "name": "annotations",
      "type": {
        "type": "array",
        "items": {
          "type": "enum",
          "name": "bucket_tags",
          "symbols": ["in_domain", "in_reports"]
        }
      }
    }
  ]
}
```
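Since Avro schemas are plain JSON, the field layout can be sanity-checked with the standard library (`doc` strings omitted here for brevity; this is a quick illustration, not a full Avro validation):

```python
import json

DEBUG_FACT_SCHEMA = """
{
  "type": "record",
  "name": "DebugAggregatedFact",
  "fields": [
    {"name": "bucket", "type": "bytes"},
    {"name": "unnoised_metric", "type": "long"},
    {"name": "noise", "type": "long"},
    {"name": "annotations", "type": {
      "type": "array",
      "items": {"type": "enum", "name": "bucket_tags",
                "symbols": ["in_domain", "in_reports"]}}}
  ]
}
"""

schema = json.loads(DEBUG_FACT_SCHEMA)
field_names = [f["name"] for f in schema["fields"]]
print(field_names)  # ['bucket', 'unnoised_metric', 'noise', 'annotations']
```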
4 changes: 2 additions & 2 deletions DEPENDENCIES.md
@@ -2,8 +2,8 @@

The deployment of the Amazon Web Services [Nitro Enclaves](https://aws.amazon.com/ec2/nitro/nitro-enclaves/) based Aggregation Service depends on several packaged
artifacts listed below.
These artifacts can be downloaded with the [download-dependencies.sh](./terraform/aws/download-dependencies.sh)
script.
These artifacts can be downloaded with the [download_prebuilt_dependencies.sh](./terraform/aws/download_prebuilt_dependencies.sh)
script. See [README](/README.md#download-terraform-scripts-and-prebuilt-dependencies).

## Packaged AWS Lambda Jars

