diff --git a/examples/tutorials/self-paced-training/part-3_security_and_privacy/chapter-5_Privacy_In_Federated_Learning/05.1_privacy_filter/privacy_filtering.ipynb b/examples/tutorials/self-paced-training/part-3_security_and_privacy/chapter-5_Privacy_In_Federated_Learning/05.1_privacy_filter/privacy_filtering.ipynb
index 515d1bdb66..84cc50b35c 100644
--- a/examples/tutorials/self-paced-training/part-3_security_and_privacy/chapter-5_Privacy_In_Federated_Learning/05.1_privacy_filter/privacy_filtering.ipynb
+++ b/examples/tutorials/self-paced-training/part-3_security_and_privacy/chapter-5_Privacy_In_Federated_Learning/05.1_privacy_filter/privacy_filtering.ipynb
@@ -5,23 +5,11 @@
"id": "1398ef0a-f189-4d04-a8a9-276a17ab0f8b",
"metadata": {},
"source": [
- "# Federated Learning with Differential Privacy\n",
+ "# Privacy Preservation using NVFlare's Filters\n",
"\n",
- "Please make sure you set up a virtual environment and follow [example root readme](../../README.md) before starting this notebook.\n",
- "Then, install the requirements.\n",
+ "[Filters](https://nvflare.readthedocs.io/en/main/programming_guide/filters.html) in NVIDIA FLARE are a type of `FLComponent` with a `process` method that transforms the `Shareable` object exchanged between the communicating parties. A filter can apply additional processing to shareable data before it is sent to, or after it is received from, the peer.\n",
"\n",
- "NOTE: Some of the cells below generate long text output. We're using `%%capture --no-display --no-stderr cell_output` to suppress this output. Comment or delete this line in the cells below to restore full output."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "id": "5002e45c-f58e-4f68-bb5a-9626e084947f",
- "metadata": {},
- "outputs": [],
- "source": [
- "%%capture --no-display --no-stderr cell_output\n",
- "!pip install -r requirements.txt"
+ "The `FLContext` is available for the `Filter` to use. Filters can be added to your NVFlare job using the [FedJob API](https://nvflare.readthedocs.io/en/main/programming_guide/fed_job_api.html#fedjob-api), which you should be familiar with from previous chapters."
]
},
{
@@ -29,74 +17,61 @@
"id": "bddd90a1-fe96-4f24-b360-bbe73b24e34a",
"metadata": {},
"source": [
- "### Differential Privacy (DP)\n",
- "[Differential Privacy (DP)](https://arxiv.org/abs/1910.00962) [7] is a method for ensuring that Federated Learning (FL) preserves privacy by obfuscating the model updates sent from clients to the central server. \n",
- "This example shows the usage of a CIFAR-10 training code with NVFlare, as well as the usage of DP filters in your FL training. DP is added as a filter in `config_fed_client.json`. Here, we use the \"Sparse Vector Technique\", i.e. the [SVTPrivacy](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.app_common.filters.svt_privacy.html) protocol, as utilized in [Li et al. 2019](https://arxiv.org/abs/1910.00962) [7] (see [Lyu et al. 2016](https://arxiv.org/abs/1603.01699) [8] for more information)."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "9b0c692a-16dc-4ef9-a432-4b7375a2a7d6",
- "metadata": {},
- "source": [
- "## Run experiments with FL simulator\n",
- "### Training with FL simulator\n",
- "FL simulator is used to simulate FL experiments or debug codes, not for real FL deployment.\n",
+ "#### Filters\n",
+ "In NVFlare, filters are used for the pre- and post-processing of a task's data.\n",
"\n",
- "First, train a model using the FedAvg algorithm with four clients without DP."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "fce5fd7e-f911-4a04-81a6-312e43c832c3",
- "metadata": {},
- "outputs": [],
- "source": [
- "!nvflare simulator './configs/brats_fedavg' -w './workspace_brats/brats_fedavg' -n 4 -t 4"
+ "On the **Server** side, before a task is sent to the **Client**, “task data filters” (if any) are applied to the task data, and only the filtered task data is sent to the client. Similarly, when the task result is received from the client, “task result filters” are applied to the received result before it is passed on to the `Controller`.\n",
+ "\n",
+ "On the **Client** side, once a task is received from the Server, “task data filters” (if any) are applied to the task data before it is passed to the task executor. Similarly, once the `Executor` has computed the task result, “task result filters” are applied to the result before it is sent to the **Server**.\n",
+ "\n",
+ ""
]
},
{
"cell_type": "markdown",
- "id": "c82a3be9-9e58-44ca-9d3f-e85456de7f12",
- "metadata": {},
- "source": [
- "Run the FL simulator with 4 clients for federated learning with differential privacy by running"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "4f1f9065-129f-4e62-ac3d-a1504a3b30bf",
+ "id": "7d299352-28c6-4be6-9297-42a1c8184191",
"metadata": {},
- "outputs": [],
"source": [
- "!nvflare simulator './configs/brats_fedavg_dp' -w './workspace_brats/brats_fedavg_dp' -n 4 -t 4"
+ "#### Examples of Filters\n",
+ "Filters are the primary technique for data privacy protection.\n",
+ "\n",
+ "Filters can convert data formats and much more: any transformation of the data that serves security or privacy can be applied in a filter. In fact, NVFlare's privacy and homomorphic encryption techniques are all implemented as filters:\n",
+ "\n",
+ "- `ExcludeVars` to exclude variables from the shareable (`nvflare.app_common.filters.exclude_vars`)\n",
+ "- `PercentilePrivacy` for truncation of weights by percentile (`nvflare.app_common.filters.percentile_privacy`)\n",
+ "- `SVTPrivacy` for differential privacy through the sparse vector technique (`nvflare.app_common.filters.svt_privacy`)\n",
+ "- Homomorphic encryption filters to encrypt data before sharing (`nvflare.app_common.homomorphic_encryption.he_model_encryptor` and `nvflare.app_common.homomorphic_encryption.he_model_decryptor`)"
]
},
{
"cell_type": "markdown",
- "id": "7118e9a1-85fc-4e5a-8b29-fb5f50a4f941",
+ "id": "9b0c692a-16dc-4ef9-a432-4b7375a2a7d6",
"metadata": {},
"source": [
- "### Testing with FL simulator\n",
- "The best global models are stored at\n",
- "```\n",
- "workspace_brats/[job]/simulated_job/app_server/best_FL_global_model.pt\n",
- "```\n",
+ "#### Adding a Filter with the JobAPI\n",
+ "You can add `Filters` to an NVFlare job using the `job.to()` method by specifying which tasks the filter applies to and when to apply it, **before** or **after** the task.\n",
+ "\n",
+ "The behavior can be selected by using the [FilterType](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.job_config.defs.html#nvflare.job_config.defs.FilterType). Users must specify the filter type as either `FilterType.TASK_RESULT` (flow from executor to controller) or `FilterType.TASK_DATA` (flow from controller to executor).\n",
"\n",
- "Please then add the correct paths to the testing script, and run"
+ "The filter will be added to \"task_data_filters\" or \"task_result_filters\" accordingly and applied to the specified tasks (defaults to “[*]” for all tasks).\n",
+ "\n",
+ "For example, you can add a privacy filter as follows:\n",
+ "```python\n",
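+ "from nvflare import FilterType\n",
+ "from nvflare.app_common.filters.percentile_privacy import PercentilePrivacy\n",
+ "pp_filter = PercentilePrivacy(percentile=10, gamma=0.01)\n",
+ "job.to(pp_filter, \"site-1\", tasks=[\"train\"], filter_type=FilterType.TASK_RESULT)\n",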
+ "```"
]
},
{
- "cell_type": "code",
- "execution_count": null,
- "id": "e926d179-4063-4f27-9815-b9e3f9569067",
+ "cell_type": "markdown",
+ "id": "351e067f-495a-45d9-bcfd-d8031584cffb",
"metadata": {},
- "outputs": [],
"source": [
- "!cd ./result_stat\n",
- "!bash testing_models_3d.sh"
+ "#### Writing Your Own Filter\n",
+ "To write your own filter, you can subclass the [DXOFilter](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.apis.dxo_filter.html#nvflare.apis.dxo_filter.DXOFilter) base class. For details, see the [documentation](https://nvflare.readthedocs.io/en/main/programming_guide/filters.html)."
]
}
],
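
Note on the filter flow described in `privacy_filtering.ipynb`: the ordering of "task data filters" and "task result filters" on the client side, and the idea of writing a custom filter, can be sketched without NVFlare. The snippet below is a minimal, framework-free illustration under stated assumptions, not NVFlare's implementation. `ToyFilter`, `ExcludeKeys`, `ClipFilter`, `apply_filters`, `run_task_on_client`, and the dict-based `Shareable` are all hypothetical stand-ins chosen for this sketch; a real custom filter would subclass `DXOFilter` as the notebook describes.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for NVFlare's Shareable: a plain dict of named values.
Shareable = Dict[str, float]


class ToyFilter:
    """Hypothetical minimal filter: one process() method that transforms a shareable."""

    def process(self, shareable: Shareable) -> Shareable:
        raise NotImplementedError


class ExcludeKeys(ToyFilter):
    """Drops the named entries, loosely in the spirit of an exclude-variables filter."""

    def __init__(self, keys: List[str]):
        self.keys = set(keys)

    def process(self, shareable: Shareable) -> Shareable:
        return {k: v for k, v in shareable.items() if k not in self.keys}


class ClipFilter(ToyFilter):
    """Clips every value to [-limit, limit] before it leaves the client."""

    def __init__(self, limit: float):
        self.limit = limit

    def process(self, shareable: Shareable) -> Shareable:
        return {k: max(-self.limit, min(self.limit, v)) for k, v in shareable.items()}


def apply_filters(filters: List[ToyFilter], shareable: Shareable) -> Shareable:
    """Applies filters in order; each one sees the output of the previous one."""
    for f in filters:
        shareable = f.process(shareable)
    return shareable


def run_task_on_client(
    task_data: Shareable,
    executor: Callable[[Shareable], Shareable],
    task_data_filters: List[ToyFilter],
    task_result_filters: List[ToyFilter],
) -> Shareable:
    """Client side: filter incoming task data, execute, then filter the outgoing result."""
    filtered_data = apply_filters(task_data_filters, task_data)
    result = executor(filtered_data)
    return apply_filters(task_result_filters, result)


def toy_executor(weights: Shareable) -> Shareable:
    """Pretend training step: doubles every value it receives."""
    return {name: value * 2.0 for name, value in weights.items()}


global_weights = {"conv1": 0.25, "fc1": -1.5, "debug_counter": 7.0}
sent_to_server = run_task_on_client(
    global_weights,
    toy_executor,
    task_data_filters=[ExcludeKeys(["debug_counter"])],
    task_result_filters=[ClipFilter(limit=1.0)],
)
print(sent_to_server)  # {'conv1': 0.5, 'fc1': -1.0}
```

In this sketch the executor never sees `debug_counter`, because the task data filter removed it first, and the server never sees the unclipped value `-3.0`, because the task result filter ran before the result was returned. This mirrors the role that `FilterType.TASK_DATA` and `FilterType.TASK_RESULT` play in `job.to()`.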
diff --git a/examples/tutorials/self-paced-training/part-3_security_and_privacy/chapter-5_Privacy_In_Federated_Learning/05.2_differency_privacy/privacy_with_differential_privacy.ipynb b/examples/tutorials/self-paced-training/part-3_security_and_privacy/chapter-5_Privacy_In_Federated_Learning/05.2_differency_privacy/privacy_with_differential_privacy.ipynb
index 924fd7e69f..a03af803af 100644
--- a/examples/tutorials/self-paced-training/part-3_security_and_privacy/chapter-5_Privacy_In_Federated_Learning/05.2_differency_privacy/privacy_with_differential_privacy.ipynb
+++ b/examples/tutorials/self-paced-training/part-3_security_and_privacy/chapter-5_Privacy_In_Federated_Learning/05.2_differency_privacy/privacy_with_differential_privacy.ipynb
@@ -217,7 +217,7 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 7,
"id": "4ebc6f36-4d0c-41ac-8540-32dac533043a",
"metadata": {
"scrolled": true
@@ -693,7 +693,115 @@
"2025-02-24 19:17:43,114 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 0.805\n",
"2025-02-24 19:17:44,116 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 0.810\n",
"2025-02-24 19:17:52,321 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 0.831\n",
- "2025-02-24 19:17:53,275 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 0.817\n"
+ "2025-02-24 19:17:53,275 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 0.817\n",
+ "2025-02-24 19:18:01,573 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 0.854\n",
+ "2025-02-24 19:18:02,465 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 0.834\n",
+ "2025-02-24 19:18:10,809 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 0.849\n",
+ "2025-02-24 19:18:11,621 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 0.869\n",
+ "2025-02-24 19:18:19,808 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 0.875\n",
+ "2025-02-24 19:18:20,590 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 0.873\n",
+ "2025-02-24 19:18:31,488 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 0.757\n",
+ "2025-02-24 19:18:32,028 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 0.739\n",
+ "2025-02-24 19:18:40,641 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 0.799\n",
+ "2025-02-24 19:18:41,086 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 0.803\n",
+ "2025-02-24 19:18:49,622 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 0.825\n",
+ "2025-02-24 19:18:50,343 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 0.796\n",
+ "2025-02-24 19:18:59,010 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 0.807\n",
+ "2025-02-24 19:18:59,839 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 0.819\n",
+ "2025-02-24 19:19:08,335 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 0.814\n",
+ "2025-02-24 19:19:09,119 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 0.844\n",
+ "2025-02-24 19:19:17,692 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 0.835\n",
+ "2025-02-24 19:19:18,343 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 0.832\n",
+ "2025-02-24 19:19:20,124 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:19:20,867 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:19:28,372 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 64 %\n",
+ "2025-02-24 19:19:28,376 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:19:28,598 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=944f7288-d797-4017-8729-cb3a0937b9ec]: finished processing task\n",
+ "2025-02-24 19:19:28,598 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=944f7288-d797-4017-8729-cb3a0937b9ec]: try #1: sending task result to server\n",
+ "2025-02-24 19:19:28,598 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=944f7288-d797-4017-8729-cb3a0937b9ec]: checking task ...\n",
+ "2025-02-24 19:19:28,599 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:19:28,604 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=944f7288-d797-4017-8729-cb3a0937b9ec]: start to send task result to server\n",
+ "2025-02-24 19:19:28,605 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:19:28,610 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job]: got result from client site-2 for task: name=train, id=944f7288-d797-4017-8729-cb3a0937b9ec\n",
+ "2025-02-24 19:19:28,610 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=944f7288-d797-4017-8729-cb3a0937b9ec]: validation metric 64 from client site-2\n",
+ "2025-02-24 19:19:28,694 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=944f7288-d797-4017-8729-cb3a0937b9ec]: finished processing client result by controller\n",
+ "2025-02-24 19:19:28,694 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-2 task_id:944f7288-d797-4017-8729-cb3a0937b9ec\n",
+ "2025-02-24 19:19:28,695 - Communicator - INFO - SubmitUpdate size: 251.4KB (251449 Bytes). time: 0.090678 seconds\n",
+ "2025-02-24 19:19:28,696 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=944f7288-d797-4017-8729-cb3a0937b9ec]: task result sent to server\n",
+ "2025-02-24 19:19:28,696 - ClientTaskWorker - INFO - Finished one task run for client: site-2 interval: 2 task_processed: True\n",
+ "2025-02-24 19:19:28,903 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 64 %\n",
+ "2025-02-24 19:19:28,906 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:19:29,162 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: finished processing task\n",
+ "2025-02-24 19:19:29,162 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: try #1: sending task result to server\n",
+ "2025-02-24 19:19:29,162 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: checking task ...\n",
+ "2025-02-24 19:19:29,163 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:19:29,169 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: start to send task result to server\n",
+ "2025-02-24 19:19:29,169 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:19:29,174 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job]: got result from client site-1 for task: name=train, id=4ee559a9-2896-456d-937b-04cc3c7e0399\n",
+ "2025-02-24 19:19:29,175 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: validation metric 64 from client site-1\n",
+ "2025-02-24 19:19:29,256 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: finished processing client result by controller\n",
+ "2025-02-24 19:19:29,256 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-1 task_id:4ee559a9-2896-456d-937b-04cc3c7e0399\n",
+ "2025-02-24 19:19:29,258 - Communicator - INFO - SubmitUpdate size: 251.4KB (251449 Bytes). time: 0.088624 seconds\n",
+ "2025-02-24 19:19:29,258 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: task result sent to server\n",
+ "2025-02-24 19:19:29,259 - ClientTaskWorker - INFO - Finished one task run for client: site-1 interval: 2 task_processed: True\n",
+ "2025-02-24 19:19:29,294 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: task train exit with status TaskCompletionStatus.OK\n",
+ "2025-02-24 19:19:29,295 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: aggregating 2 update(s) at round 4\n",
+ "2025-02-24 19:19:29,296 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: Start persist model on server.\n",
+ "2025-02-24 19:19:29,299 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: End persist model on server.\n",
+ "2025-02-24 19:19:29,300 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=4ee559a9-2896-456d-937b-04cc3c7e0399]: Finished FedAvg.\n",
+ "2025-02-24 19:19:29,300 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Workflow: controller finalizing ...\n",
+ "2025-02-24 19:19:29,300 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: ABOUT_TO_END_RUN fired\n",
+ "2025-02-24 19:19:29,303 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:19:29,305 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: received request from Server to end current RUN\n",
+ "2025-02-24 19:19:29,306 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: received request from Server to end current RUN\n",
+ "2025-02-24 19:19:29,804 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:19:30,305 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:19:30,701 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job]: server runner is finalizing - asked client to end the run\n",
+ "2025-02-24 19:19:30,702 - GetTaskCommand - INFO - return task to client. client_name: site-2 task_name: __end_run__ task_id: sharable_header_task_id: \n",
+ "2025-02-24 19:19:30,704 - FederatedClient - INFO - pull_task completed. Task name:__end_run__ Status:True \n",
+ "2025-02-24 19:19:30,704 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: server asked to end the run\n",
+ "2025-02-24 19:19:30,704 - ClientRunner - INFO - [identity=site-2, run=simulate_job]: started end-run events sequence\n",
+ "2025-02-24 19:19:30,704 - ClientRunner - INFO - [identity=site-2, run=simulate_job]: ABOUT_TO_END_RUN fired\n",
+ "2025-02-24 19:19:30,704 - ClientRunner - INFO - [identity=site-2, run=simulate_job]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:19:30,705 - InProcessClientAPI - WARNING - ask to stop job: reason: END_RUN received\n",
+ "2025-02-24 19:19:30,806 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:19:30,879 - InProcessClientAPI - WARNING - request to stop the job for reason END_RUN received\n",
+ "2025-02-24 19:19:30,883 - ClientRunner - INFO - [identity=site-2, run=simulate_job]: END_RUN fired\n",
+ "2025-02-24 19:19:30,883 - ClientTaskWorker - INFO - End the Simulator run.\n",
+ "2025-02-24 19:19:30,924 - ClientTaskWorker - INFO - Clean up ClientRunner for : site-2 \n",
+ "2025-02-24 19:19:30,926 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00006 Not Connected] is closed PID: 3593243\n",
+ "2025-02-24 19:19:30,926 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00002 Not Connected] is closed PID: 3593266\n",
+ "2025-02-24 19:19:31,261 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job]: server runner is finalizing - asked client to end the run\n",
+ "2025-02-24 19:19:31,262 - GetTaskCommand - INFO - return task to client. client_name: site-1 task_name: __end_run__ task_id: sharable_header_task_id: \n",
+ "2025-02-24 19:19:31,263 - FederatedClient - INFO - pull_task completed. Task name:__end_run__ Status:True \n",
+ "2025-02-24 19:19:31,263 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: server asked to end the run\n",
+ "2025-02-24 19:19:31,263 - ClientRunner - INFO - [identity=site-1, run=simulate_job]: started end-run events sequence\n",
+ "2025-02-24 19:19:31,263 - ClientRunner - INFO - [identity=site-1, run=simulate_job]: ABOUT_TO_END_RUN fired\n",
+ "2025-02-24 19:19:31,263 - ClientRunner - INFO - [identity=site-1, run=simulate_job]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:19:31,264 - InProcessClientAPI - WARNING - ask to stop job: reason: END_RUN received\n",
+ "2025-02-24 19:19:31,307 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:19:31,407 - InProcessClientAPI - WARNING - request to stop the job for reason END_RUN received\n",
+ "2025-02-24 19:19:31,411 - ClientRunner - INFO - [identity=site-1, run=simulate_job]: END_RUN fired\n",
+ "2025-02-24 19:19:31,411 - ClientTaskWorker - INFO - End the Simulator run.\n",
+ "2025-02-24 19:19:31,452 - ClientTaskWorker - INFO - Clean up ClientRunner for : site-1 \n",
+ "2025-02-24 19:19:31,453 - FederatedClient - INFO - Shutting down client run: site-1\n",
+ "2025-02-24 19:19:31,453 - FederatedClient - INFO - Shutting down client run: site-2\n",
+ "2025-02-24 19:19:31,454 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: asked to abort - triggered abort_signal to stop the RUN\n",
+ "2025-02-24 19:19:31,455 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00002 Not Connected] is closed PID: 3593265\n",
+ "2025-02-24 19:19:31,455 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00005 Not Connected] is closed PID: 3593243\n",
+ "2025-02-24 19:19:33,315 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: END_RUN fired\n",
+ "2025-02-24 19:19:33,316 - ReliableMessage - INFO - ReliableMessage is shutdown\n",
+ "2025-02-24 19:19:33,316 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Server runner finished.\n",
+ "2025-02-24 19:19:33,901 - ReliableMessage - INFO - shutdown reliable message monitor\n",
+ "2025-02-24 19:19:33,952 - SimulatorServer - INFO - Server app stopped.\n",
+ "\n",
+ "\n",
+ "2025-02-24 19:19:34,265 - nvflare.fuel.hci.server.hci - INFO - Admin Server localhost on Port 49271 shutdown!\n",
+ "2025-02-24 19:19:34,265 - SimulatorServer - INFO - shutting down server\n",
+ "2025-02-24 19:19:34,265 - SimulatorServer - INFO - canceling sync locks\n",
+ "2025-02-24 19:19:34,265 - SimulatorServer - INFO - server off\n",
+ "2025-02-24 19:19:37,459 - MPM - WARNING - #### MPM: still running thread Thread-9\n",
+ "2025-02-24 19:19:37,460 - MPM - INFO - MPM: Good Bye!\n"
]
}
],
@@ -717,7 +825,7 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 8,
"id": "330e6fca-8098-4be4-8d75-6b5e7ab1869d",
"metadata": {},
"outputs": [],
@@ -725,7 +833,7 @@
"from nvflare import FilterType\n",
"from nvflare.app_common.filters import SVTPrivacy\n",
"\n",
- "# Create BaseFedJob with initial model\n",
+ "# Create BaseFedJob with the initial model\n",
"job = BaseFedJob(\n",
" name=\"cifar10_fedavg_dp\",\n",
" initial_model=Net(),\n",
@@ -760,12 +868,631 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 9,
"id": "fc6b911d-a171-49b1-ad2e-b0d73032110c",
"metadata": {
"scrolled": true
},
- "outputs": [],
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "2025-02-24 19:19:37,954 - SimulatorRunner - INFO - Create the Simulator Server.\n",
+ "2025-02-24 19:19:37,956 - CoreCell - INFO - server: creating listener on tcp://0:58061\n",
+ "2025-02-24 19:19:37,976 - CoreCell - INFO - server: created backbone external listener for tcp://0:58061\n",
+ "2025-02-24 19:19:37,976 - ConnectorManager - INFO - 3595083: Try start_listener Listener resources: {'secure': False, 'host': 'localhost'}\n",
+ "2025-02-24 19:19:37,977 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connector [CH00002 PASSIVE tcp://0:3458] is starting\n",
+ "2025-02-24 19:19:38,478 - CoreCell - INFO - server: created backbone internal listener for tcp://localhost:3458\n",
+ "2025-02-24 19:19:38,478 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connector [CH00001 PASSIVE tcp://0:58061] is starting\n",
+ "2025-02-24 19:19:38,479 - SimulatorServer - INFO - max_reg_duration=60.0\n",
+ "2025-02-24 19:19:38,549 - nvflare.fuel.hci.server.hci - INFO - Starting Admin Server localhost on Port 49545\n",
+ "2025-02-24 19:19:38,549 - SimulatorRunner - INFO - Deploy the Apps.\n",
+ "2025-02-24 19:19:38,553 - SimulatorRunner - INFO - Create the simulate clients.\n",
+ "2025-02-24 19:19:38,555 - Communicator - INFO - Trying to register with server ...\n",
+ "2025-02-24 19:19:38,556 - ClientManager - INFO - authenticated client site-1\n",
+ "2025-02-24 19:19:38,556 - ClientManager - INFO - Client: New client site-1@192.168.1.203 joined. Sent token: a6b863ee-a4ad-439f-ad18-489238e1a3f9. Total clients: 1\n",
+ "2025-02-24 19:19:38,556 - Communicator - INFO - register RC: ok\n",
+ "2025-02-24 19:19:38,556 - FederatedClient - INFO - Successfully registered client:site-1 for project simulator_server. Token:a6b863ee-a4ad-439f-ad18-489238e1a3f9 SSID:\n",
+ "2025-02-24 19:19:38,557 - Communicator - INFO - Trying to register with server ...\n",
+ "2025-02-24 19:19:38,558 - ClientManager - INFO - authenticated client site-2\n",
+ "2025-02-24 19:19:38,558 - ClientManager - INFO - Client: New client site-2@192.168.1.203 joined. Sent token: 4b189806-5f94-48b9-b374-8a31e8e99b98. Total clients: 2\n",
+ "2025-02-24 19:19:38,558 - Communicator - INFO - register RC: ok\n",
+ "2025-02-24 19:19:38,558 - FederatedClient - INFO - Successfully registered client:site-2 for project simulator_server. Token:4b189806-5f94-48b9-b374-8a31e8e99b98 SSID:\n",
+ "2025-02-24 19:19:38,558 - SimulatorRunner - INFO - Set the client status ready.\n",
+ "2025-02-24 19:19:38,558 - SimulatorRunner - INFO - Deploy and start the Server App.\n",
+ "2025-02-24 19:19:38,560 - Cell - INFO - Register blob CB for channel='server_command', topic='*'\n",
+ "2025-02-24 19:19:38,560 - Cell - INFO - Register blob CB for channel='aux_communication', topic='*'\n",
+ "2025-02-24 19:19:38,560 - ServerCommandAgent - INFO - ServerCommandAgent cell register_request_cb: server.simulate_job\n",
+ "2025-02-24 19:19:38,567 - IntimeModelSelector - INFO - model selection weights control: {}\n",
+ "2025-02-24 19:19:39,752 - AuxRunner - INFO - registered aux handler for topic __sync_runner__\n",
+ "2025-02-24 19:19:39,752 - AuxRunner - INFO - registered aux handler for topic __job_heartbeat__\n",
+ "2025-02-24 19:19:39,752 - AuxRunner - INFO - registered aux handler for topic __task_check__\n",
+ "2025-02-24 19:19:39,753 - AuxRunner - INFO - registered aux handler for topic RM.RELIABLE_REQUEST\n",
+ "2025-02-24 19:19:39,753 - AuxRunner - INFO - registered aux handler for topic RM.RELIABLE_REPLY\n",
+ "2025-02-24 19:19:39,754 - ReliableMessage - INFO - enabled reliable message: max_request_workers=20 query_interval=2.0\n",
+ "2025-02-24 19:19:39,754 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job]: Server runner starting ...\n",
+ "2025-02-24 19:19:39,755 - TBAnalyticsReceiver - INFO - [identity=simulator_server, run=simulate_job]: Tensorboard records can be found in /tmp/nvflare/cifar10_fedavg_dp/server/simulate_job/tb_events you can view it using `tensorboard --logdir=/tmp/nvflare/cifar10_fedavg_dp/server/simulate_job/tb_events`\n",
+ "2025-02-24 19:19:39,755 - AuxRunner - INFO - registered aux handler for topic fed.event\n",
+ "2025-02-24 19:19:39,755 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job]: starting workflow controller () ...\n",
+ "2025-02-24 19:19:39,756 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Initializing BaseModelController workflow.\n",
+ "2025-02-24 19:19:39,756 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Workflow controller () started\n",
+ "2025-02-24 19:19:39,757 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Beginning model controller run.\n",
+ "2025-02-24 19:19:39,757 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Start FedAvg.\n",
+ "2025-02-24 19:19:39,757 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: loading initial model from persistor\n",
+ "2025-02-24 19:19:39,757 - PTFileModelPersistor - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Both source_ckpt_file_full_name and ckpt_preload_path are not provided. Using the default model weights initialized on the persistor side.\n",
+ "2025-02-24 19:19:39,758 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Round 0 started.\n",
+ "2025-02-24 19:19:39,758 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Sampled clients: ['site-1', 'site-2']\n",
+ "2025-02-24 19:19:39,758 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Sending task train to ['site-1', 'site-2']\n",
+ "2025-02-24 19:19:39,759 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: scheduled task train\n",
+ "2025-02-24 19:19:40,604 - SimulatorClientRunner - INFO - Start the clients run simulation.\n",
+ "2025-02-24 19:19:41,605 - SimulatorClientRunner - INFO - Simulate Run client: site-1 on GPU group: 0\n",
+ "2025-02-24 19:19:41,605 - SimulatorClientRunner - INFO - Simulate Run client: site-2 on GPU group: 0\n",
+ "2025-02-24 19:19:42,635 - ClientTaskWorker - INFO - ClientTaskWorker started to run\n",
+ "2025-02-24 19:19:42,646 - ClientTaskWorker - INFO - ClientTaskWorker started to run\n",
+ "2025-02-24 19:19:42,702 - CoreCell - INFO - site-1.simulate_job: created backbone external connector to tcp://localhost:58061\n",
+ "2025-02-24 19:19:42,702 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connector [CH00001 ACTIVE tcp://localhost:58061] is starting\n",
+ "2025-02-24 19:19:42,703 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00002 127.0.0.1:58938 => 127.0.0.1:58061] is created: PID: 3595105\n",
+ "2025-02-24 19:19:42,704 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00005 127.0.0.1:58061 <= 127.0.0.1:58938] is created: PID: 3595083\n",
+ "2025-02-24 19:19:42,714 - CoreCell - INFO - site-2.simulate_job: created backbone external connector to tcp://localhost:58061\n",
+ "2025-02-24 19:19:42,714 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connector [CH00001 ACTIVE tcp://localhost:58061] is starting\n",
+ "2025-02-24 19:19:42,715 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00002 127.0.0.1:58944 => 127.0.0.1:58061] is created: PID: 3595106\n",
+ "2025-02-24 19:19:42,715 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00006 127.0.0.1:58061 <= 127.0.0.1:58944] is created: PID: 3595083\n",
+ "2025-02-24 19:19:44,562 - AuxRunner - INFO - registered aux handler for topic __end_run__\n",
+ "2025-02-24 19:19:44,562 - AuxRunner - INFO - registered aux handler for topic __end_run__\n",
+ "2025-02-24 19:19:44,562 - AuxRunner - INFO - registered aux handler for topic __do_task__\n",
+ "2025-02-24 19:19:44,562 - AuxRunner - INFO - registered aux handler for topic __do_task__\n",
+ "2025-02-24 19:19:44,562 - Cell - INFO - Register blob CB for channel='aux_communication', topic='*'\n",
+ "2025-02-24 19:19:44,562 - Cell - INFO - Register blob CB for channel='aux_communication', topic='*'\n",
+ "2025-02-24 19:19:45,068 - Cell - INFO - broadcast: channel='aux_communication', topic='__sync_runner__', targets=['server.simulate_job'], timeout=2.0\n",
+ "2025-02-24 19:19:45,069 - Cell - INFO - broadcast: channel='aux_communication', topic='__sync_runner__', targets=['server.simulate_job'], timeout=2.0\n",
+ "2025-02-24 19:19:45,081 - ClientRunner - INFO - [identity=site-2, run=simulate_job]: synced to Server Runner in 0.5122072696685791 seconds\n",
+ "2025-02-24 19:19:45,081 - AuxRunner - INFO - registered aux handler for topic RM.RELIABLE_REQUEST\n",
+ "2025-02-24 19:19:45,081 - AuxRunner - INFO - registered aux handler for topic RM.RELIABLE_REPLY\n",
+ "2025-02-24 19:19:45,081 - ReliableMessage - INFO - enabled reliable message: max_request_workers=20 query_interval=2.0\n",
+ "2025-02-24 19:19:45,082 - TaskScriptRunner - INFO - start task run() with full path: /tmp/nvflare/cifar10_fedavg_dp/site-2/simulate_job/app_site-2/custom/src/cifar10_fl.py\n",
+ "2025-02-24 19:19:45,083 - ClientRunner - INFO - [identity=site-1, run=simulate_job]: synced to Server Runner in 0.5150198936462402 seconds\n",
+ "2025-02-24 19:19:45,083 - AuxRunner - INFO - registered aux handler for topic RM.RELIABLE_REQUEST\n",
+ "2025-02-24 19:19:45,083 - AuxRunner - INFO - registered aux handler for topic RM.RELIABLE_REPLY\n",
+ "2025-02-24 19:19:45,084 - AuxRunner - INFO - registered aux handler for topic fed.event\n",
+ "2025-02-24 19:19:45,084 - ReliableMessage - INFO - enabled reliable message: max_request_workers=20 query_interval=2.0\n",
+ "2025-02-24 19:19:45,084 - ClientRunner - INFO - [identity=site-2, run=simulate_job]: client runner started\n",
+ "2025-02-24 19:19:45,084 - ClientTaskWorker - INFO - Initialize ClientRunner for client: site-2\n",
+ "2025-02-24 19:19:45,085 - TaskScriptRunner - INFO - start task run() with full path: /tmp/nvflare/cifar10_fedavg_dp/site-1/simulate_job/app_site-1/custom/src/cifar10_fl.py\n",
+ "2025-02-24 19:19:45,087 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: assigned task to client site-2: name=train, id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be\n",
+ "2025-02-24 19:19:45,088 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: sent task assignment to client. client_name:site-2 task_id:ed30f55b-2274-4ca3-b5c4-d4a28b8c64be\n",
+ "2025-02-24 19:19:45,088 - GetTaskCommand - INFO - return task to client. client_name: site-2 task_name: train task_id: ed30f55b-2274-4ca3-b5c4-d4a28b8c64be sharable_header_task_id: ed30f55b-2274-4ca3-b5c4-d4a28b8c64be\n",
+ "2025-02-24 19:19:45,090 - AuxRunner - INFO - registered aux handler for topic fed.event\n",
+ "2025-02-24 19:19:45,091 - ClientRunner - INFO - [identity=site-1, run=simulate_job]: client runner started\n",
+ "2025-02-24 19:19:45,091 - ClientTaskWorker - INFO - Initialize ClientRunner for client: site-1\n",
+ "2025-02-24 19:19:45,097 - Communicator - INFO - Received from simulator_server server. getTask: train size: 251.5KB (251471 Bytes) time: 0.012291 seconds\n",
+ "2025-02-24 19:19:45,097 - FederatedClient - INFO - pull_task completed. Task name:train Status:True \n",
+ "2025-02-24 19:19:45,097 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: got task assignment: name=train, id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be\n",
+ "2025-02-24 19:19:45,097 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: assigned task to client site-1: name=train, id=5cc519aa-f084-422d-8df8-df6faba3c0d0\n",
+ "2025-02-24 19:19:45,097 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: sent task assignment to client. client_name:site-1 task_id:5cc519aa-f084-422d-8df8-df6faba3c0d0\n",
+ "2025-02-24 19:19:45,098 - GetTaskCommand - INFO - return task to client. client_name: site-1 task_name: train task_id: 5cc519aa-f084-422d-8df8-df6faba3c0d0 sharable_header_task_id: 5cc519aa-f084-422d-8df8-df6faba3c0d0\n",
+ "2025-02-24 19:19:45,098 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: invoking task executor PTInProcessClientAPIExecutor\n",
+ "2025-02-24 19:19:45,098 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: execute for task (train)\n",
+ "2025-02-24 19:19:45,098 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: send data to peer\n",
+ "2025-02-24 19:19:45,098 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: sending payload to peer\n",
+ "2025-02-24 19:19:45,103 - Communicator - INFO - Received from simulator_server server. getTask: train size: 251.5KB (251471 Bytes) time: 0.011704 seconds\n",
+ "2025-02-24 19:19:45,103 - FederatedClient - INFO - pull_task completed. Task name:train Status:True \n",
+ "2025-02-24 19:19:45,103 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: got task assignment: name=train, id=5cc519aa-f084-422d-8df8-df6faba3c0d0\n",
+ "2025-02-24 19:19:45,103 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: invoking task executor PTInProcessClientAPIExecutor\n",
+ "2025-02-24 19:19:45,104 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: execute for task (train)\n",
+ "2025-02-24 19:19:45,108 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: Waiting for result from peer\n",
+ "2025-02-24 19:19:45,108 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: send data to peer\n",
+ "2025-02-24 19:19:45,108 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: sending payload to peer\n",
+ "2025-02-24 19:19:45,113 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: Waiting for result from peer\n",
+ "2025-02-24 19:19:47,960 - nvflare.app_common.executors.task_script_runner - INFO - current_round=0\n",
+ "2025-02-24 19:19:47,961 - nvflare.app_common.executors.task_script_runner - INFO - current_round=0\n",
+ "2025-02-24 19:19:58,119 - nvflare.app_common.executors.task_script_runner - INFO - [1, 2000] loss: 2.258\n",
+ "2025-02-24 19:19:58,175 - nvflare.app_common.executors.task_script_runner - INFO - [1, 2000] loss: 2.259\n",
+ "2025-02-24 19:20:07,203 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 1.975\n",
+ "2025-02-24 19:20:07,393 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 1.956\n",
+ "2025-02-24 19:20:16,753 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 1.753\n",
+ "2025-02-24 19:20:16,873 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 1.744\n",
+ "2025-02-24 19:20:26,358 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 1.619\n",
+ "2025-02-24 19:20:26,475 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 1.631\n",
+ "2025-02-24 19:20:35,515 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 1.545\n",
+ "2025-02-24 19:20:35,625 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 1.548\n",
+ "2025-02-24 19:20:44,808 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 1.488\n",
+ "2025-02-24 19:20:44,938 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 1.499\n",
+ "2025-02-24 19:20:56,777 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 1.415\n",
+ "2025-02-24 19:20:56,804 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 1.420\n",
+ "2025-02-24 19:21:06,159 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 1.411\n",
+ "2025-02-24 19:21:06,243 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 1.375\n",
+ "2025-02-24 19:21:15,358 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 1.375\n",
+ "2025-02-24 19:21:15,433 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 1.348\n",
+ "2025-02-24 19:21:24,720 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 1.364\n",
+ "2025-02-24 19:21:24,760 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 1.347\n",
+ "2025-02-24 19:21:34,266 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 1.347\n",
+ "2025-02-24 19:21:34,313 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 1.329\n",
+ "2025-02-24 19:21:43,782 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 1.305\n",
+ "2025-02-24 19:21:43,891 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 1.301\n",
+ "2025-02-24 19:21:46,176 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:21:46,273 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:21:54,602 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 10 %\n",
+ "2025-02-24 19:21:54,606 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:21:54,611 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 10 %\n",
+ "2025-02-24 19:21:54,614 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:21:54,710 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: Delta_w: Max abs: 3.869296415359713e-05, Min abs: 9.187448024583489e-11, Median abs: 1.290611862714286e-06.\n",
+ "2025-02-24 19:21:54,710 - SVTPrivacy - INFO - total params: 62006, epsilon: 0.1, perparam budget 1.6126431220770844e-05, threshold tau: 1e-06 + f(eps_1) = 0.00024251645252032099, clip gamma: 1e-05\n",
+ "2025-02-24 19:21:54,714 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: selected 29426 responses, requested 6201.0\n",
+ "2025-02-24 19:21:54,715 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: noise max: 0.0018257939708887336, median 0.00013786521784798058\n",
+ "2025-02-24 19:21:54,724 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: finished processing task\n",
+ "2025-02-24 19:21:54,725 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: try #1: sending task result to server\n",
+ "2025-02-24 19:21:54,725 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: checking task ...\n",
+ "2025-02-24 19:21:54,725 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:21:54,729 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: start to send task result to server\n",
+ "2025-02-24 19:21:54,729 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:21:54,733 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job]: got result from client site-2 for task: name=train, id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be\n",
+ "2025-02-24 19:21:54,776 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: Delta_w: Max abs: 4.0250157326227054e-05, Min abs: 2.8359222956075847e-11, Median abs: 1.296140908380039e-06.\n",
+ "2025-02-24 19:21:54,777 - SVTPrivacy - INFO - total params: 62006, epsilon: 0.1, perparam budget 1.6126431220770844e-05, threshold tau: 1e-06 + f(eps_1) = 0.00022680394985402226, clip gamma: 1e-05\n",
+ "2025-02-24 19:21:54,782 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: selected 29541 responses, requested 6201.0\n",
+ "2025-02-24 19:21:54,784 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: noise max: 0.0017226140842528442, median 0.00014242334695973427\n",
+ "2025-02-24 19:21:54,793 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: finished processing task\n",
+ "2025-02-24 19:21:54,794 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: try #1: sending task result to server\n",
+ "2025-02-24 19:21:54,794 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: checking task ...\n",
+ "2025-02-24 19:21:54,794 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:21:54,820 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: finished processing client result by controller\n",
+ "2025-02-24 19:21:54,821 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-2 task_id:ed30f55b-2274-4ca3-b5c4-d4a28b8c64be\n",
+ "2025-02-24 19:21:54,822 - Communicator - INFO - SubmitUpdate size: 251.5KB (251476 Bytes). time: 0.093112 seconds\n",
+ "2025-02-24 19:21:54,822 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=ed30f55b-2274-4ca3-b5c4-d4a28b8c64be]: task result sent to server\n",
+ "2025-02-24 19:21:54,823 - ClientTaskWorker - INFO - Finished one task run for client: site-2 interval: 2 task_processed: True\n",
+ "2025-02-24 19:21:54,825 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: start to send task result to server\n",
+ "2025-02-24 19:21:54,825 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:21:54,829 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job]: got result from client site-1 for task: name=train, id=5cc519aa-f084-422d-8df8-df6faba3c0d0\n",
+ "2025-02-24 19:21:54,900 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: finished processing client result by controller\n",
+ "2025-02-24 19:21:54,900 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: task train exit with status TaskCompletionStatus.OK\n",
+ "2025-02-24 19:21:54,900 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-1 task_id:5cc519aa-f084-422d-8df8-df6faba3c0d0\n",
+ "2025-02-24 19:21:54,902 - Communicator - INFO - SubmitUpdate size: 251.5KB (251476 Bytes). time: 0.077421 seconds\n",
+ "2025-02-24 19:21:54,903 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: task result sent to server\n",
+ "2025-02-24 19:21:54,903 - ClientTaskWorker - INFO - Finished one task run for client: site-1 interval: 2 task_processed: True\n",
+ "2025-02-24 19:21:55,021 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: aggregating 2 update(s) at round 0\n",
+ "2025-02-24 19:21:55,023 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: Start persist model on server.\n",
+ "2025-02-24 19:21:55,026 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: End persist model on server.\n",
+ "2025-02-24 19:21:55,027 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: Round 1 started.\n",
+ "2025-02-24 19:21:55,027 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: Sampled clients: ['site-1', 'site-2']\n",
+ "2025-02-24 19:21:55,027 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: Sending task train to ['site-1', 'site-2']\n",
+ "2025-02-24 19:21:55,027 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=5cc519aa-f084-422d-8df8-df6faba3c0d0]: scheduled task train\n",
+ "2025-02-24 19:21:56,828 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: assigned task to client site-2: name=train, id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4\n",
+ "2025-02-24 19:21:56,828 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: sent task assignment to client. client_name:site-2 task_id:a66879bc-3113-45d2-ac2f-dedfdc2de4a4\n",
+ "2025-02-24 19:21:56,829 - GetTaskCommand - INFO - return task to client. client_name: site-2 task_name: train task_id: a66879bc-3113-45d2-ac2f-dedfdc2de4a4 sharable_header_task_id: a66879bc-3113-45d2-ac2f-dedfdc2de4a4\n",
+ "2025-02-24 19:21:56,833 - Communicator - INFO - Received from simulator_server server. getTask: train size: 251.5KB (251536 Bytes) time: 0.008358 seconds\n",
+ "2025-02-24 19:21:56,833 - FederatedClient - INFO - pull_task completed. Task name:train Status:True \n",
+ "2025-02-24 19:21:56,833 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: got task assignment: name=train, id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4\n",
+ "2025-02-24 19:21:56,833 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: invoking task executor PTInProcessClientAPIExecutor\n",
+ "2025-02-24 19:21:56,833 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: execute for task (train)\n",
+ "2025-02-24 19:21:56,834 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: send data to peer\n",
+ "2025-02-24 19:21:56,834 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: sending payload to peer\n",
+ "2025-02-24 19:21:56,834 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: Waiting for result from peer\n",
+ "2025-02-24 19:21:56,907 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: assigned task to client site-1: name=train, id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788\n",
+ "2025-02-24 19:21:56,908 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: sent task assignment to client. client_name:site-1 task_id:d6f5a720-8f8b-4dee-9d6d-41c9e24d0788\n",
+ "2025-02-24 19:21:56,908 - GetTaskCommand - INFO - return task to client. client_name: site-1 task_name: train task_id: d6f5a720-8f8b-4dee-9d6d-41c9e24d0788 sharable_header_task_id: d6f5a720-8f8b-4dee-9d6d-41c9e24d0788\n",
+ "2025-02-24 19:21:56,912 - Communicator - INFO - Received from simulator_server server. getTask: train size: 251.5KB (251536 Bytes) time: 0.007530 seconds\n",
+ "2025-02-24 19:21:56,912 - FederatedClient - INFO - pull_task completed. Task name:train Status:True \n",
+ "2025-02-24 19:21:56,912 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: got task assignment: name=train, id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788\n",
+ "2025-02-24 19:21:56,912 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: invoking task executor PTInProcessClientAPIExecutor\n",
+ "2025-02-24 19:21:56,912 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: execute for task (train)\n",
+ "2025-02-24 19:21:56,913 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: send data to peer\n",
+ "2025-02-24 19:21:56,913 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: sending payload to peer\n",
+ "2025-02-24 19:21:56,913 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: Waiting for result from peer\n",
+ "2025-02-24 19:21:57,107 - nvflare.app_common.executors.task_script_runner - INFO - current_round=1\n",
+ "2025-02-24 19:21:57,116 - nvflare.app_common.executors.task_script_runner - INFO - current_round=1\n",
+ "2025-02-24 19:22:06,704 - nvflare.app_common.executors.task_script_runner - INFO - [1, 2000] loss: 2.196\n",
+ "2025-02-24 19:22:06,740 - nvflare.app_common.executors.task_script_runner - INFO - [1, 2000] loss: 2.180\n",
+ "2025-02-24 19:22:16,125 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 1.848\n",
+ "2025-02-24 19:22:16,276 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 1.866\n",
+ "2025-02-24 19:22:26,076 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 1.652\n",
+ "2025-02-24 19:22:26,087 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 1.684\n",
+ "2025-02-24 19:22:35,607 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 1.560\n",
+ "2025-02-24 19:22:35,631 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 1.584\n",
+ "2025-02-24 19:22:44,953 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 1.538\n",
+ "2025-02-24 19:22:45,360 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 1.523\n",
+ "2025-02-24 19:22:54,617 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 1.469\n",
+ "2025-02-24 19:22:55,083 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 1.498\n",
+ "2025-02-24 19:23:07,035 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 1.420\n",
+ "2025-02-24 19:23:07,415 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 1.424\n",
+ "2025-02-24 19:23:16,909 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 1.377\n",
+ "2025-02-24 19:23:17,276 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 1.393\n",
+ "2025-02-24 19:23:26,507 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 1.355\n",
+ "2025-02-24 19:23:26,831 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 1.376\n",
+ "2025-02-24 19:23:36,233 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 1.354\n",
+ "2025-02-24 19:23:36,344 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 1.378\n",
+ "2025-02-24 19:23:45,786 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 1.325\n",
+ "2025-02-24 19:23:45,979 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 1.300\n",
+ "2025-02-24 19:23:55,323 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 1.334\n",
+ "2025-02-24 19:23:55,523 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 1.312\n",
+ "2025-02-24 19:23:57,885 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:23:58,086 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:24:06,285 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 10 %\n",
+ "2025-02-24 19:24:06,289 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:24:06,390 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 10 %\n",
+ "2025-02-24 19:24:06,393 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:24:06,430 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: Delta_w: Max abs: 3.798707984969951e-05, Min abs: 4.402827928629005e-12, Median abs: 7.905571237643017e-07.\n",
+ "2025-02-24 19:24:06,431 - SVTPrivacy - INFO - total params: 62006, epsilon: 0.1, perparam budget 1.6126431220770844e-05, threshold tau: 1e-06 + f(eps_1) = -1.7757807953687535e-05, clip gamma: 1e-05\n",
+ "2025-02-24 19:24:06,434 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: selected 31257 responses, requested 6201.0\n",
+ "2025-02-24 19:24:06,436 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: noise max: 0.0017802754564614869, median 0.0001357158707154192\n",
+ "2025-02-24 19:24:06,445 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: finished processing task\n",
+ "2025-02-24 19:24:06,445 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: try #1: sending task result to server\n",
+ "2025-02-24 19:24:06,445 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: checking task ...\n",
+ "2025-02-24 19:24:06,445 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:24:06,448 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: start to send task result to server\n",
+ "2025-02-24 19:24:06,448 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:24:06,453 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job]: got result from client site-2 for task: name=train, id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4\n",
+ "2025-02-24 19:24:06,455 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: validation metric 10 from client site-2\n",
+ "2025-02-24 19:24:06,504 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: Delta_w: Max abs: 2.9061351597192697e-05, Min abs: 0.0, Median abs: 7.140769753277709e-07.\n",
+ "2025-02-24 19:24:06,504 - SVTPrivacy - INFO - total params: 62006, epsilon: 0.1, perparam budget 1.6126431220770844e-05, threshold tau: 1e-06 + f(eps_1) = 0.00021483682419918906, clip gamma: 1e-05\n",
+ "2025-02-24 19:24:06,509 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: selected 29595 responses, requested 6201.0\n",
+ "2025-02-24 19:24:06,511 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: noise max: 0.0018775447812598568, median 0.00013961098501939689\n",
+ "2025-02-24 19:24:06,520 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: finished processing task\n",
+ "2025-02-24 19:24:06,521 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: try #1: sending task result to server\n",
+ "2025-02-24 19:24:06,521 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: checking task ...\n",
+ "2025-02-24 19:24:06,521 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:24:06,537 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: finished processing client result by controller\n",
+ "2025-02-24 19:24:06,538 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-2 task_id:a66879bc-3113-45d2-ac2f-dedfdc2de4a4\n",
+ "2025-02-24 19:24:06,540 - Communicator - INFO - SubmitUpdate size: 251.5KB (251476 Bytes). time: 0.091972 seconds\n",
+ "2025-02-24 19:24:06,540 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: start to send task result to server\n",
+ "2025-02-24 19:24:06,540 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:24:06,540 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=a66879bc-3113-45d2-ac2f-dedfdc2de4a4]: task result sent to server\n",
+ "2025-02-24 19:24:06,541 - ClientTaskWorker - INFO - Finished one task run for client: site-2 interval: 2 task_processed: True\n",
+ "2025-02-24 19:24:06,544 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job]: got result from client site-1 for task: name=train, id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788\n",
+ "2025-02-24 19:24:06,544 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: validation metric 10 from client site-1\n",
+ "2025-02-24 19:24:06,619 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: finished processing client result by controller\n",
+ "2025-02-24 19:24:06,619 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-1 task_id:d6f5a720-8f8b-4dee-9d6d-41c9e24d0788\n",
+ "2025-02-24 19:24:06,621 - Communicator - INFO - SubmitUpdate size: 251.5KB (251476 Bytes). time: 0.080867 seconds\n",
+ "2025-02-24 19:24:06,622 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: task result sent to server\n",
+ "2025-02-24 19:24:06,622 - ClientTaskWorker - INFO - Finished one task run for client: site-1 interval: 2 task_processed: True\n",
+ "2025-02-24 19:24:06,738 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: task train exit with status TaskCompletionStatus.OK\n",
+ "2025-02-24 19:24:06,819 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: new best validation metric at round 1: 10.0\n",
+ "2025-02-24 19:24:06,821 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: aggregating 2 update(s) at round 1\n",
+ "2025-02-24 19:24:06,822 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: Start persist model on server.\n",
+ "2025-02-24 19:24:06,823 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: End persist model on server.\n",
+ "2025-02-24 19:24:06,823 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: Round 2 started.\n",
+ "2025-02-24 19:24:06,823 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: Sampled clients: ['site-1', 'site-2']\n",
+ "2025-02-24 19:24:06,823 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: Sending task train to ['site-1', 'site-2']\n",
+ "2025-02-24 19:24:06,823 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d6f5a720-8f8b-4dee-9d6d-41c9e24d0788]: scheduled task train\n",
+ "2025-02-24 19:24:08,545 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: assigned task to client site-2: name=train, id=e0094842-dfa8-445c-b01d-93689c9467ee\n",
+ "2025-02-24 19:24:08,545 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: sent task assignment to client. client_name:site-2 task_id:e0094842-dfa8-445c-b01d-93689c9467ee\n",
+ "2025-02-24 19:24:08,546 - GetTaskCommand - INFO - return task to client. client_name: site-2 task_name: train task_id: e0094842-dfa8-445c-b01d-93689c9467ee sharable_header_task_id: e0094842-dfa8-445c-b01d-93689c9467ee\n",
+ "2025-02-24 19:24:08,550 - Communicator - INFO - Received from simulator_server server. getTask: train size: 251.5KB (251536 Bytes) time: 0.008062 seconds\n",
+ "2025-02-24 19:24:08,550 - FederatedClient - INFO - pull_task completed. Task name:train Status:True \n",
+ "2025-02-24 19:24:08,550 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: got task assignment: name=train, id=e0094842-dfa8-445c-b01d-93689c9467ee\n",
+ "2025-02-24 19:24:08,550 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: invoking task executor PTInProcessClientAPIExecutor\n",
+ "2025-02-24 19:24:08,550 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: execute for task (train)\n",
+ "2025-02-24 19:24:08,551 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: send data to peer\n",
+ "2025-02-24 19:24:08,551 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: sending payload to peer\n",
+ "2025-02-24 19:24:08,551 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: Waiting for result from peer\n",
+ "2025-02-24 19:24:08,627 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: assigned task to client site-1: name=train, id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e\n",
+ "2025-02-24 19:24:08,628 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: sent task assignment to client. client_name:site-1 task_id:99af09e9-6c45-4f0e-9130-4fba8be4ce1e\n",
+ "2025-02-24 19:24:08,628 - GetTaskCommand - INFO - return task to client. client_name: site-1 task_name: train task_id: 99af09e9-6c45-4f0e-9130-4fba8be4ce1e sharable_header_task_id: 99af09e9-6c45-4f0e-9130-4fba8be4ce1e\n",
+ "2025-02-24 19:24:08,633 - Communicator - INFO - Received from simulator_server server. getTask: train size: 251.5KB (251536 Bytes) time: 0.009301 seconds\n",
+ "2025-02-24 19:24:08,633 - FederatedClient - INFO - pull_task completed. Task name:train Status:True \n",
+ "2025-02-24 19:24:08,633 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: got task assignment: name=train, id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e\n",
+ "2025-02-24 19:24:08,633 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: invoking task executor PTInProcessClientAPIExecutor\n",
+ "2025-02-24 19:24:08,633 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: execute for task (train)\n",
+ "2025-02-24 19:24:08,633 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: send data to peer\n",
+ "2025-02-24 19:24:08,634 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: sending payload to peer\n",
+ "2025-02-24 19:24:08,634 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: Waiting for result from peer\n",
+ "2025-02-24 19:24:08,791 - nvflare.app_common.executors.task_script_runner - INFO - current_round=2\n",
+ "2025-02-24 19:24:08,894 - nvflare.app_common.executors.task_script_runner - INFO - current_round=2\n",
+ "2025-02-24 19:24:18,554 - nvflare.app_common.executors.task_script_runner - INFO - [1, 2000] loss: 2.166\n",
+ "2025-02-24 19:24:18,743 - nvflare.app_common.executors.task_script_runner - INFO - [1, 2000] loss: 2.155\n",
+ "2025-02-24 19:24:28,317 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 1.831\n",
+ "2025-02-24 19:24:28,549 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 1.830\n",
+ "2025-02-24 19:24:37,887 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 1.685\n",
+ "2025-02-24 19:24:38,022 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 1.662\n",
+ "2025-02-24 19:24:47,461 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 1.586\n",
+ "2025-02-24 19:24:47,630 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 1.589\n",
+ "2025-02-24 19:24:56,974 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 1.522\n",
+ "2025-02-24 19:24:57,091 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 1.542\n",
+ "2025-02-24 19:25:06,615 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 1.489\n",
+ "2025-02-24 19:25:06,840 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 1.490\n",
+ "2025-02-24 19:25:18,960 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 1.440\n",
+ "2025-02-24 19:25:19,016 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 1.414\n",
+ "2025-02-24 19:25:28,706 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 1.363\n",
+ "2025-02-24 19:25:28,833 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 1.399\n",
+ "2025-02-24 19:25:38,204 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 1.357\n",
+ "2025-02-24 19:25:38,346 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 1.358\n",
+ "2025-02-24 19:25:47,652 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 1.363\n",
+ "2025-02-24 19:25:47,757 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 1.332\n",
+ "2025-02-24 19:25:57,256 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 1.317\n",
+ "2025-02-24 19:25:57,276 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 1.337\n",
+ "2025-02-24 19:26:06,886 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 1.313\n",
+ "2025-02-24 19:26:06,931 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 1.288\n",
+ "2025-02-24 19:26:09,406 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:26:09,432 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:26:17,713 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 10 %\n",
+ "2025-02-24 19:26:17,717 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:26:17,749 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 10 %\n",
+ "2025-02-24 19:26:17,752 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:26:18,158 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: Delta_w: Max abs: 5.610540392808616e-05, Min abs: 4.645476711639951e-11, Median abs: 8.403736728723743e-07.\n",
+ "2025-02-24 19:26:18,159 - SVTPrivacy - INFO - total params: 62006, epsilon: 0.1, perparam budget 1.6126431220770844e-05, threshold tau: 1e-06 + f(eps_1) = 0.0002333115456710719, clip gamma: 1e-05\n",
+ "2025-02-24 19:26:18,164 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: selected 29319 responses, requested 6201.0\n",
+ "2025-02-24 19:26:18,165 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: noise max: 0.001532915313546089, median 0.0001411072528242095\n",
+ "2025-02-24 19:26:18,174 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: finished processing task\n",
+ "2025-02-24 19:26:18,174 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: try #1: sending task result to server\n",
+ "2025-02-24 19:26:18,175 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: checking task ...\n",
+ "2025-02-24 19:26:18,175 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:26:18,179 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: start to send task result to server\n",
+ "2025-02-24 19:26:18,179 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:26:18,183 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job]: got result from client site-2 for task: name=train, id=e0094842-dfa8-445c-b01d-93689c9467ee\n",
+ "2025-02-24 19:26:18,183 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: validation metric 10 from client site-2\n",
+ "2025-02-24 19:26:18,236 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: Delta_w: Max abs: 5.931659688940272e-05, Min abs: 1.3302400630674227e-12, Median abs: 8.146419077093014e-07.\n",
+ "2025-02-24 19:26:18,237 - SVTPrivacy - INFO - total params: 62006, epsilon: 0.1, perparam budget 1.6126431220770844e-05, threshold tau: 1e-06 + f(eps_1) = 1.393511119915548e-05, clip gamma: 1e-05\n",
+ "2025-02-24 19:26:18,245 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: selected 30996 responses, requested 6201.0\n",
+ "2025-02-24 19:26:18,247 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: noise max: 0.0019125323654773589, median 0.00013696955307286124\n",
+ "2025-02-24 19:26:18,256 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: finished processing task\n",
+ "2025-02-24 19:26:18,256 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: try #1: sending task result to server\n",
+ "2025-02-24 19:26:18,256 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: checking task ...\n",
+ "2025-02-24 19:26:18,256 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:26:18,261 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: finished processing client result by controller\n",
+ "2025-02-24 19:26:18,263 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-2 task_id:e0094842-dfa8-445c-b01d-93689c9467ee\n",
+ "2025-02-24 19:26:18,264 - Communicator - INFO - SubmitUpdate size: 251.5KB (251476 Bytes). time: 0.084803 seconds\n",
+ "2025-02-24 19:26:18,264 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=e0094842-dfa8-445c-b01d-93689c9467ee]: task result sent to server\n",
+ "2025-02-24 19:26:18,264 - ClientTaskWorker - INFO - Finished one task run for client: site-2 interval: 2 task_processed: True\n",
+ "2025-02-24 19:26:18,266 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: start to send task result to server\n",
+ "2025-02-24 19:26:18,266 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:26:18,269 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job]: got result from client site-1 for task: name=train, id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e\n",
+ "2025-02-24 19:26:18,269 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: validation metric 10 from client site-1\n",
+ "2025-02-24 19:26:18,339 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: finished processing client result by controller\n",
+ "2025-02-24 19:26:18,339 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-1 task_id:99af09e9-6c45-4f0e-9130-4fba8be4ce1e\n",
+ "2025-02-24 19:26:18,341 - Communicator - INFO - SubmitUpdate size: 251.5KB (251476 Bytes). time: 0.075582 seconds\n",
+ "2025-02-24 19:26:18,342 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: task result sent to server\n",
+ "2025-02-24 19:26:18,342 - ClientTaskWorker - INFO - Finished one task run for client: site-1 interval: 2 task_processed: True\n",
+ "2025-02-24 19:26:18,346 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: task train exit with status TaskCompletionStatus.OK\n",
+ "2025-02-24 19:26:18,379 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: aggregating 2 update(s) at round 2\n",
+ "2025-02-24 19:26:18,379 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: Start persist model on server.\n",
+ "2025-02-24 19:26:18,381 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: End persist model on server.\n",
+ "2025-02-24 19:26:18,381 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: Round 3 started.\n",
+ "2025-02-24 19:26:18,381 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: Sampled clients: ['site-1', 'site-2']\n",
+ "2025-02-24 19:26:18,381 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: Sending task train to ['site-1', 'site-2']\n",
+ "2025-02-24 19:26:18,382 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=99af09e9-6c45-4f0e-9130-4fba8be4ce1e]: scheduled task train\n",
+ "2025-02-24 19:26:20,269 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: assigned task to client site-2: name=train, id=3da79990-b261-4b9e-9f02-f45ab051d21f\n",
+ "2025-02-24 19:26:20,269 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: sent task assignment to client. client_name:site-2 task_id:3da79990-b261-4b9e-9f02-f45ab051d21f\n",
+ "2025-02-24 19:26:20,269 - GetTaskCommand - INFO - return task to client. client_name: site-2 task_name: train task_id: 3da79990-b261-4b9e-9f02-f45ab051d21f sharable_header_task_id: 3da79990-b261-4b9e-9f02-f45ab051d21f\n",
+ "2025-02-24 19:26:20,273 - Communicator - INFO - Received from simulator_server server. getTask: train size: 251.5KB (251536 Bytes) time: 0.007312 seconds\n",
+ "2025-02-24 19:26:20,273 - FederatedClient - INFO - pull_task completed. Task name:train Status:True \n",
+ "2025-02-24 19:26:20,273 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: got task assignment: name=train, id=3da79990-b261-4b9e-9f02-f45ab051d21f\n",
+ "2025-02-24 19:26:20,273 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: invoking task executor PTInProcessClientAPIExecutor\n",
+ "2025-02-24 19:26:20,273 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: execute for task (train)\n",
+ "2025-02-24 19:26:20,274 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: send data to peer\n",
+ "2025-02-24 19:26:20,274 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: sending payload to peer\n",
+ "2025-02-24 19:26:20,274 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: Waiting for result from peer\n",
+ "2025-02-24 19:26:20,347 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: assigned task to client site-1: name=train, id=d88b56ae-70ff-4a71-8c00-81af4650b178\n",
+ "2025-02-24 19:26:20,348 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: sent task assignment to client. client_name:site-1 task_id:d88b56ae-70ff-4a71-8c00-81af4650b178\n",
+ "2025-02-24 19:26:20,348 - GetTaskCommand - INFO - return task to client. client_name: site-1 task_name: train task_id: d88b56ae-70ff-4a71-8c00-81af4650b178 sharable_header_task_id: d88b56ae-70ff-4a71-8c00-81af4650b178\n",
+ "2025-02-24 19:26:20,353 - Communicator - INFO - Received from simulator_server server. getTask: train size: 251.5KB (251536 Bytes) time: 0.010121 seconds\n",
+ "2025-02-24 19:26:20,353 - FederatedClient - INFO - pull_task completed. Task name:train Status:True \n",
+ "2025-02-24 19:26:20,354 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: got task assignment: name=train, id=d88b56ae-70ff-4a71-8c00-81af4650b178\n",
+ "2025-02-24 19:26:20,354 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: invoking task executor PTInProcessClientAPIExecutor\n",
+ "2025-02-24 19:26:20,354 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: execute for task (train)\n",
+ "2025-02-24 19:26:20,354 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: send data to peer\n",
+ "2025-02-24 19:26:20,354 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: sending payload to peer\n",
+ "2025-02-24 19:26:20,354 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: Waiting for result from peer\n",
+ "2025-02-24 19:26:20,718 - nvflare.app_common.executors.task_script_runner - INFO - current_round=3\n",
+ "2025-02-24 19:26:20,756 - nvflare.app_common.executors.task_script_runner - INFO - current_round=3\n",
+ "2025-02-24 19:26:30,547 - nvflare.app_common.executors.task_script_runner - INFO - [1, 2000] loss: 2.180\n",
+ "2025-02-24 19:26:30,683 - nvflare.app_common.executors.task_script_runner - INFO - [1, 2000] loss: 2.175\n",
+ "2025-02-24 19:26:40,176 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 1.818\n",
+ "2025-02-24 19:26:40,509 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 1.807\n",
+ "2025-02-24 19:26:49,978 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 1.634\n",
+ "2025-02-24 19:26:50,178 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 1.647\n",
+ "2025-02-24 19:26:59,743 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 1.551\n",
+ "2025-02-24 19:26:59,861 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 1.569\n",
+ "2025-02-24 19:27:09,296 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 1.486\n",
+ "2025-02-24 19:27:09,353 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 1.520\n",
+ "2025-02-24 19:27:18,669 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 1.447\n",
+ "2025-02-24 19:27:18,942 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 1.463\n",
+ "2025-02-24 19:27:30,848 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 1.364\n",
+ "2025-02-24 19:27:31,048 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 1.400\n",
+ "2025-02-24 19:27:40,556 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 1.369\n",
+ "2025-02-24 19:27:40,576 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 1.366\n",
+ "2025-02-24 19:27:49,848 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 1.341\n",
+ "2025-02-24 19:27:50,065 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 1.363\n",
+ "2025-02-24 19:27:59,320 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 1.308\n",
+ "2025-02-24 19:27:59,595 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 1.341\n",
+ "2025-02-24 19:28:08,952 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 1.306\n",
+ "2025-02-24 19:28:08,964 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 1.318\n",
+ "2025-02-24 19:28:18,356 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 1.278\n",
+ "2025-02-24 19:28:18,395 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 1.272\n",
+ "2025-02-24 19:28:20,817 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:28:20,829 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:28:29,226 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 11 %\n",
+ "2025-02-24 19:28:29,228 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 11 %\n",
+ "2025-02-24 19:28:29,230 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:28:29,231 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:28:29,428 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: Delta_w: Max abs: 3.0773393518757075e-05, Min abs: 9.738089674915518e-12, Median abs: 8.66032792146143e-07.\n",
+ "2025-02-24 19:28:29,428 - SVTPrivacy - INFO - total params: 62006, epsilon: 0.1, perparam budget 1.6126431220770844e-05, threshold tau: 1e-06 + f(eps_1) = -0.00017374283661396686, clip gamma: 1e-05\n",
+ "2025-02-24 19:28:29,437 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: selected 32077 responses, requested 6201.0\n",
+ "2025-02-24 19:28:29,439 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: noise max: 0.001641621628439241, median 0.00013925051066147063\n",
+ "2025-02-24 19:28:29,447 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: finished processing task\n",
+ "2025-02-24 19:28:29,447 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: try #1: sending task result to server\n",
+ "2025-02-24 19:28:29,447 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: checking task ...\n",
+ "2025-02-24 19:28:29,448 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:28:29,450 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: start to send task result to server\n",
+ "2025-02-24 19:28:29,450 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:28:29,453 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job]: got result from client site-2 for task: name=train, id=3da79990-b261-4b9e-9f02-f45ab051d21f\n",
+ "2025-02-24 19:28:29,454 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: validation metric 11 from client site-2\n",
+ "2025-02-24 19:28:29,490 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: Delta_w: Max abs: 2.9308683224371634e-05, Min abs: 6.0259371277571194e-12, Median abs: 7.825964303265209e-07.\n",
+ "2025-02-24 19:28:29,490 - SVTPrivacy - INFO - total params: 62006, epsilon: 0.1, perparam budget 1.6126431220770844e-05, threshold tau: 1e-06 + f(eps_1) = -0.0003496977515021054, clip gamma: 1e-05\n",
+ "2025-02-24 19:28:29,495 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: selected 33222 responses, requested 6201.0\n",
+ "2025-02-24 19:28:29,498 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: noise max: 0.002178795869111997, median 0.00013565106286838233\n",
+ "2025-02-24 19:28:29,507 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: finished processing task\n",
+ "2025-02-24 19:28:29,507 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: try #1: sending task result to server\n",
+ "2025-02-24 19:28:29,507 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: checking task ...\n",
+ "2025-02-24 19:28:29,507 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:28:29,535 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: finished processing client result by controller\n",
+ "2025-02-24 19:28:29,536 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-2 task_id:3da79990-b261-4b9e-9f02-f45ab051d21f\n",
+ "2025-02-24 19:28:29,538 - Communicator - INFO - SubmitUpdate size: 251.5KB (251476 Bytes). time: 0.087110 seconds\n",
+ "2025-02-24 19:28:29,538 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: start to send task result to server\n",
+ "2025-02-24 19:28:29,538 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=3da79990-b261-4b9e-9f02-f45ab051d21f]: task result sent to server\n",
+ "2025-02-24 19:28:29,538 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:28:29,538 - ClientTaskWorker - INFO - Finished one task run for client: site-2 interval: 2 task_processed: True\n",
+ "2025-02-24 19:28:29,544 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job]: got result from client site-1 for task: name=train, id=d88b56ae-70ff-4a71-8c00-81af4650b178\n",
+ "2025-02-24 19:28:29,545 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: validation metric 11 from client site-1\n",
+ "2025-02-24 19:28:29,616 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: finished processing client result by controller\n",
+ "2025-02-24 19:28:29,616 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-1 task_id:d88b56ae-70ff-4a71-8c00-81af4650b178\n",
+ "2025-02-24 19:28:29,618 - Communicator - INFO - SubmitUpdate size: 251.5KB (251476 Bytes). time: 0.079753 seconds\n",
+ "2025-02-24 19:28:29,618 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: task result sent to server\n",
+ "2025-02-24 19:28:29,619 - ClientTaskWorker - INFO - Finished one task run for client: site-1 interval: 2 task_processed: True\n",
+ "2025-02-24 19:28:29,740 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: task train exit with status TaskCompletionStatus.OK\n",
+ "2025-02-24 19:28:29,940 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: new best validation metric at round 3: 11.0\n",
+ "2025-02-24 19:28:29,942 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: aggregating 2 update(s) at round 3\n",
+ "2025-02-24 19:28:29,943 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: Start persist model on server.\n",
+ "2025-02-24 19:28:29,945 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: End persist model on server.\n",
+ "2025-02-24 19:28:29,945 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: Round 4 started.\n",
+ "2025-02-24 19:28:29,945 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: Sampled clients: ['site-1', 'site-2']\n",
+ "2025-02-24 19:28:29,945 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: Sending task train to ['site-1', 'site-2']\n",
+ "2025-02-24 19:28:29,945 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=d88b56ae-70ff-4a71-8c00-81af4650b178]: scheduled task train\n",
+ "2025-02-24 19:28:31,544 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: assigned task to client site-2: name=train, id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6\n",
+ "2025-02-24 19:28:31,544 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: sent task assignment to client. client_name:site-2 task_id:0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6\n",
+ "2025-02-24 19:28:31,545 - GetTaskCommand - INFO - return task to client. client_name: site-2 task_name: train task_id: 0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6 sharable_header_task_id: 0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6\n",
+ "2025-02-24 19:28:31,550 - Communicator - INFO - Received from simulator_server server. getTask: train size: 251.5KB (251536 Bytes) time: 0.010322 seconds\n",
+ "2025-02-24 19:28:31,550 - FederatedClient - INFO - pull_task completed. Task name:train Status:True \n",
+ "2025-02-24 19:28:31,550 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: got task assignment: name=train, id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6\n",
+ "2025-02-24 19:28:31,550 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: invoking task executor PTInProcessClientAPIExecutor\n",
+ "2025-02-24 19:28:31,551 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: execute for task (train)\n",
+ "2025-02-24 19:28:31,551 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: send data to peer\n",
+ "2025-02-24 19:28:31,551 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: sending payload to peer\n",
+ "2025-02-24 19:28:31,551 - PTInProcessClientAPIExecutor - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: Waiting for result from peer\n",
+ "2025-02-24 19:28:31,623 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: assigned task to client site-1: name=train, id=bb22caa5-b594-4713-8316-170485b3d18d\n",
+ "2025-02-24 19:28:31,623 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: sent task assignment to client. client_name:site-1 task_id:bb22caa5-b594-4713-8316-170485b3d18d\n",
+ "2025-02-24 19:28:31,623 - GetTaskCommand - INFO - return task to client. client_name: site-1 task_name: train task_id: bb22caa5-b594-4713-8316-170485b3d18d sharable_header_task_id: bb22caa5-b594-4713-8316-170485b3d18d\n",
+ "2025-02-24 19:28:31,628 - Communicator - INFO - Received from simulator_server server. getTask: train size: 251.5KB (251536 Bytes) time: 0.008492 seconds\n",
+ "2025-02-24 19:28:31,628 - FederatedClient - INFO - pull_task completed. Task name:train Status:True \n",
+ "2025-02-24 19:28:31,628 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: got task assignment: name=train, id=bb22caa5-b594-4713-8316-170485b3d18d\n",
+ "2025-02-24 19:28:31,629 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: invoking task executor PTInProcessClientAPIExecutor\n",
+ "2025-02-24 19:28:31,629 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: execute for task (train)\n",
+ "2025-02-24 19:28:31,629 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: send data to peer\n",
+ "2025-02-24 19:28:31,629 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: sending payload to peer\n",
+ "2025-02-24 19:28:31,630 - PTInProcessClientAPIExecutor - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: Waiting for result from peer\n",
+ "2025-02-24 19:28:31,731 - nvflare.app_common.executors.task_script_runner - INFO - current_round=4\n",
+ "2025-02-24 19:28:31,732 - nvflare.app_common.executors.task_script_runner - INFO - current_round=4\n",
+ "2025-02-24 19:28:41,459 - nvflare.app_common.executors.task_script_runner - INFO - [1, 2000] loss: 2.179\n",
+ "2025-02-24 19:28:41,717 - nvflare.app_common.executors.task_script_runner - INFO - [1, 2000] loss: 2.168\n",
+ "2025-02-24 19:28:50,860 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 1.776\n",
+ "2025-02-24 19:28:51,141 - nvflare.app_common.executors.task_script_runner - INFO - [1, 4000] loss: 1.777\n",
+ "2025-02-24 19:29:00,200 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 1.617\n",
+ "2025-02-24 19:29:00,509 - nvflare.app_common.executors.task_script_runner - INFO - [1, 6000] loss: 1.632\n",
+ "2025-02-24 19:29:09,693 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 1.576\n",
+ "2025-02-24 19:29:10,014 - nvflare.app_common.executors.task_script_runner - INFO - [1, 8000] loss: 1.550\n",
+ "2025-02-24 19:29:19,003 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 1.496\n",
+ "2025-02-24 19:29:19,331 - nvflare.app_common.executors.task_script_runner - INFO - [1, 10000] loss: 1.522\n",
+ "2025-02-24 19:29:28,403 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 1.476\n",
+ "2025-02-24 19:29:28,676 - nvflare.app_common.executors.task_script_runner - INFO - [1, 12000] loss: 1.480\n",
+ "2025-02-24 19:29:40,552 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 1.402\n",
+ "2025-02-24 19:29:40,843 - nvflare.app_common.executors.task_script_runner - INFO - [2, 2000] loss: 1.409\n",
+ "2025-02-24 19:29:50,249 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 1.404\n",
+ "2025-02-24 19:29:50,530 - nvflare.app_common.executors.task_script_runner - INFO - [2, 4000] loss: 1.408\n",
+ "2025-02-24 19:29:59,640 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 1.357\n",
+ "2025-02-24 19:30:00,032 - nvflare.app_common.executors.task_script_runner - INFO - [2, 6000] loss: 1.357\n",
+ "2025-02-24 19:30:09,094 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 1.324\n",
+ "2025-02-24 19:30:09,469 - nvflare.app_common.executors.task_script_runner - INFO - [2, 8000] loss: 1.348\n",
+ "2025-02-24 19:30:18,562 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 1.320\n",
+ "2025-02-24 19:30:18,938 - nvflare.app_common.executors.task_script_runner - INFO - [2, 10000] loss: 1.305\n",
+ "2025-02-24 19:30:28,174 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 1.304\n",
+ "2025-02-24 19:30:28,338 - nvflare.app_common.executors.task_script_runner - INFO - [2, 12000] loss: 1.294\n",
+ "2025-02-24 19:30:30,553 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:30:30,788 - nvflare.app_common.executors.task_script_runner - INFO - Finished Training\n",
+ "2025-02-24 19:30:39,142 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 10 %\n",
+ "2025-02-24 19:30:39,146 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:30:39,167 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: Delta_w: Max abs: 3.639836722868495e-05, Min abs: 0.0, Median abs: 8.709176881893654e-07.\n",
+ "2025-02-24 19:30:39,168 - SVTPrivacy - INFO - total params: 62006, epsilon: 0.1, perparam budget 1.6126431220770844e-05, threshold tau: 1e-06 + f(eps_1) = -0.00010546252856728773, clip gamma: 1e-05\n",
+ "2025-02-24 19:30:39,172 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: selected 31801 responses, requested 6201.0\n",
+ "2025-02-24 19:30:39,175 - SVTPrivacy - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: noise max: 0.0019834287722930743, median 0.00013931068783016654\n",
+ "2025-02-24 19:30:39,191 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: finished processing task\n",
+ "2025-02-24 19:30:39,191 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: try #1: sending task result to server\n",
+ "2025-02-24 19:30:39,191 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: checking task ...\n",
+ "2025-02-24 19:30:39,191 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:30:39,195 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: start to send task result to server\n",
+ "2025-02-24 19:30:39,195 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:30:39,199 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job]: got result from client site-2 for task: name=train, id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6\n",
+ "2025-02-24 19:30:39,200 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: validation metric 10 from client site-2\n",
+ "2025-02-24 19:30:39,280 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: finished processing client result by controller\n",
+ "2025-02-24 19:30:39,280 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-2 task_id:0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6\n",
+ "2025-02-24 19:30:39,284 - Communicator - INFO - SubmitUpdate size: 251.5KB (251476 Bytes). time: 0.088063 seconds\n",
+ "2025-02-24 19:30:39,284 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=0cd48cfa-c95d-4de0-8ff6-fbda9c4e89a6]: task result sent to server\n",
+ "2025-02-24 19:30:39,284 - ClientTaskWorker - INFO - Finished one task run for client: site-2 interval: 2 task_processed: True\n",
+ "2025-02-24 19:30:39,288 - nvflare.app_common.executors.task_script_runner - INFO - Accuracy of the network on the 10000 test images: 10 %\n",
+ "2025-02-24 19:30:39,291 - InProcessClientAPI - INFO - Try to send local model back to peer \n",
+ "2025-02-24 19:30:39,737 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: Delta_w: Max abs: 3.792212010012008e-05, Min abs: 6.787099094546223e-13, Median abs: 8.725046427571215e-07.\n",
+ "2025-02-24 19:30:39,738 - SVTPrivacy - INFO - total params: 62006, epsilon: 0.1, perparam budget 1.6126431220770844e-05, threshold tau: 1e-06 + f(eps_1) = -0.00014016834743477033, clip gamma: 1e-05\n",
+ "2025-02-24 19:30:39,743 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: selected 31924 responses, requested 6201.0\n",
+ "2025-02-24 19:30:39,745 - SVTPrivacy - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: noise max: 0.001544578122705486, median 0.0001410220052472861\n",
+ "2025-02-24 19:30:39,754 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: finished processing task\n",
+ "2025-02-24 19:30:39,754 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: try #1: sending task result to server\n",
+ "2025-02-24 19:30:39,754 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: checking task ...\n",
+ "2025-02-24 19:30:39,754 - Cell - INFO - broadcast: channel='aux_communication', topic='__task_check__', targets=['server.simulate_job'], timeout=5.0\n",
+ "2025-02-24 19:30:39,757 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: start to send task result to server\n",
+ "2025-02-24 19:30:39,757 - FederatedClient - INFO - Starting to push execute result.\n",
+ "2025-02-24 19:30:39,761 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job]: got result from client site-1 for task: name=train, id=bb22caa5-b594-4713-8316-170485b3d18d\n",
+ "2025-02-24 19:30:39,761 - IntimeModelSelector - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: validation metric 10 from client site-1\n",
+ "2025-02-24 19:30:39,839 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: finished processing client result by controller\n",
+ "2025-02-24 19:30:39,839 - SubmitUpdateCommand - INFO - submit_update process. client_name:site-1 task_id:bb22caa5-b594-4713-8316-170485b3d18d\n",
+ "2025-02-24 19:30:39,840 - Communicator - INFO - SubmitUpdate size: 251.5KB (251476 Bytes). time: 0.082771 seconds\n",
+ "2025-02-24 19:30:39,841 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: task result sent to server\n",
+ "2025-02-24 19:30:39,841 - ClientTaskWorker - INFO - Finished one task run for client: site-1 interval: 2 task_processed: True\n",
+ "2025-02-24 19:30:39,937 - WFCommServer - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: task train exit with status TaskCompletionStatus.OK\n",
+ "2025-02-24 19:30:40,119 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: aggregating 2 update(s) at round 4\n",
+ "2025-02-24 19:30:40,120 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: Start persist model on server.\n",
+ "2025-02-24 19:30:40,123 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: End persist model on server.\n",
+ "2025-02-24 19:30:40,123 - FedAvg - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job, peer_rc=OK, task_name=train, task_id=bb22caa5-b594-4713-8316-170485b3d18d]: Finished FedAvg.\n",
+ "2025-02-24 19:30:40,123 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Workflow: controller finalizing ...\n",
+ "2025-02-24 19:30:40,123 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: ABOUT_TO_END_RUN fired\n",
+ "2025-02-24 19:30:40,124 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:30:40,127 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: received request from Server to end current RUN\n",
+ "2025-02-24 19:30:40,127 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: received request from Server to end current RUN\n",
+ "2025-02-24 19:30:40,624 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:30:41,125 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:30:41,288 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-2, peer_run=simulate_job]: server runner is finalizing - asked client to end the run\n",
+ "2025-02-24 19:30:41,289 - GetTaskCommand - INFO - return task to client. client_name: site-2 task_name: __end_run__ task_id: sharable_header_task_id: \n",
+ "2025-02-24 19:30:41,291 - FederatedClient - INFO - pull_task completed. Task name:__end_run__ Status:True \n",
+ "2025-02-24 19:30:41,292 - ClientRunner - INFO - [identity=site-2, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: server asked to end the run\n",
+ "2025-02-24 19:30:41,292 - ClientRunner - INFO - [identity=site-2, run=simulate_job]: started end-run events sequence\n",
+ "2025-02-24 19:30:41,292 - ClientRunner - INFO - [identity=site-2, run=simulate_job]: ABOUT_TO_END_RUN fired\n",
+ "2025-02-24 19:30:41,293 - ClientRunner - INFO - [identity=site-2, run=simulate_job]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:30:41,293 - InProcessClientAPI - WARNING - ask to stop job: reason: END_RUN received\n",
+ "2025-02-24 19:30:41,626 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:30:41,648 - InProcessClientAPI - WARNING - request to stop the job for reason END_RUN received\n",
+ "2025-02-24 19:30:41,650 - ClientRunner - INFO - [identity=site-2, run=simulate_job]: END_RUN fired\n",
+ "2025-02-24 19:30:41,650 - ClientTaskWorker - INFO - End the Simulator run.\n",
+ "2025-02-24 19:30:41,691 - ClientTaskWorker - INFO - Clean up ClientRunner for : site-2 \n",
+ "2025-02-24 19:30:41,693 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00002 Not Connected] is closed PID: 3595106\n",
+ "2025-02-24 19:30:41,693 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00006 Not Connected] is closed PID: 3595083\n",
+ "2025-02-24 19:30:41,844 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller, peer=site-1, peer_run=simulate_job]: server runner is finalizing - asked client to end the run\n",
+ "2025-02-24 19:30:41,844 - GetTaskCommand - INFO - return task to client. client_name: site-1 task_name: __end_run__ task_id: sharable_header_task_id: \n",
+ "2025-02-24 19:30:41,846 - FederatedClient - INFO - pull_task completed. Task name:__end_run__ Status:True \n",
+ "2025-02-24 19:30:41,846 - ClientRunner - INFO - [identity=site-1, run=simulate_job, peer=simulator_server, peer_run=simulate_job]: server asked to end the run\n",
+ "2025-02-24 19:30:41,846 - ClientRunner - INFO - [identity=site-1, run=simulate_job]: started end-run events sequence\n",
+ "2025-02-24 19:30:41,846 - ClientRunner - INFO - [identity=site-1, run=simulate_job]: ABOUT_TO_END_RUN fired\n",
+ "2025-02-24 19:30:41,846 - ClientRunner - INFO - [identity=site-1, run=simulate_job]: Firing CHECK_END_RUN_READINESS ...\n",
+ "2025-02-24 19:30:41,847 - InProcessClientAPI - WARNING - ask to stop job: reason: END_RUN received\n",
+ "2025-02-24 19:30:42,292 - InProcessClientAPI - WARNING - request to stop the job for reason END_RUN received\n",
+ "2025-02-24 19:30:42,296 - ClientRunner - INFO - [identity=site-1, run=simulate_job]: END_RUN fired\n",
+ "2025-02-24 19:30:42,296 - ClientTaskWorker - INFO - End the Simulator run.\n",
+ "2025-02-24 19:30:42,337 - ClientTaskWorker - INFO - Clean up ClientRunner for : site-1 \n",
+ "2025-02-24 19:30:42,338 - FederatedClient - INFO - Shutting down client run: site-1\n",
+ "2025-02-24 19:30:42,339 - FederatedClient - INFO - Shutting down client run: site-2\n",
+ "2025-02-24 19:30:42,340 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00002 Not Connected] is closed PID: 3595105\n",
+ "2025-02-24 19:30:42,340 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: asked to abort - triggered abort_signal to stop the RUN\n",
+ "2025-02-24 19:30:42,340 - nvflare.fuel.f3.sfm.conn_manager - INFO - Connection [CN00005 Not Connected] is closed PID: 3595083\n",
+ "2025-02-24 19:30:43,637 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: END_RUN fired\n",
+ "2025-02-24 19:30:43,637 - ReliableMessage - INFO - ReliableMessage is shutdown\n",
+ "2025-02-24 19:30:43,637 - ServerRunner - INFO - [identity=simulator_server, run=simulate_job, wf=controller]: Server runner finished.\n",
+ "2025-02-24 19:30:43,874 - ReliableMessage - INFO - shutdown reliable message monitor\n",
+ "2025-02-24 19:30:45,919 - SimulatorServer - INFO - Server app stopped.\n",
+ "\n",
+ "\n",
+ "2025-02-24 19:30:46,120 - nvflare.fuel.hci.server.hci - INFO - Admin Server localhost on Port 49545 shutdown!\n",
+ "2025-02-24 19:30:46,121 - SimulatorServer - INFO - shutting down server\n",
+ "2025-02-24 19:30:46,121 - SimulatorServer - INFO - canceling sync locks\n",
+ "2025-02-24 19:30:46,121 - SimulatorServer - INFO - server off\n",
+ "2025-02-24 19:30:49,472 - MPM - INFO - MPM: Good Bye!\n"
+ ]
+ }
+ ],
"source": [
"job.simulator_run(f\"/tmp/nvflare/{job.name}\", gpu=\"0\")"
]