From adf9c437326260ce147a16432346b38403c941df Mon Sep 17 00:00:00 2001
From: Holger Roth
Date: Wed, 31 Jan 2024 10:14:21 -0500
Subject: [PATCH] update gnn readme

---
 examples/advanced/gnn/README.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/examples/advanced/gnn/README.md b/examples/advanced/gnn/README.md
index 3a743431ce..f8c0c415ea 100644
--- a/examples/advanced/gnn/README.md
+++ b/examples/advanced/gnn/README.md
@@ -31,7 +31,7 @@ python3 -m pip install -r requirements.txt
 ```
 To support functions of PyTorch Geometric necessary for this example, we need extra dependencies. Please refer to [installation guide](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html) and install accordingly:
 ```
-pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.1.0+cpu.html
+python3 -m pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.1.0+cpu.html
 ```
 
 #### Job Template
@@ -46,8 +46,8 @@ nvflare job list_templates
 We can see the "sag_gnn" template is available
 
 #### Protein Classification
-The PPI dataset is directly available via torch_geometric library, we randomly split the dataset to 2 subsets, one for each client.
-First, we run the local training on each client, as well as the whole dataset.
+The PPI dataset is directly available via the torch_geometric library; we randomly split the dataset into 2 subsets, one for each client (`--client_id 1` and `--client_id 2`).
+First, we run the local training on each client, as well as on the whole dataset with `--client_id 0`.
 ```
 python3 code/graphsage_protein_local.py --client_id 0
 python3 code/graphsage_protein_local.py --client_id 1
@@ -64,7 +64,7 @@ For client configs, we set client_ids for each client, and the number of local e
 
 For server configs, we set the number of rounds for federated training, the key metric for model selection, and the model class path with model hyperparameters.
 
-With the produced job, we run the federated training on both clients via FedAvg using NVFlare Simulator.
+With the produced job, we run the federated training on both clients via FedAvg using the NVFlare Simulator.
 ```
 nvflare simulator -w /tmp/nvflare/gnn/protein_fl_workspace -n 2 -t 2 /tmp/nvflare/jobs/gnn_protein
 ```
@@ -74,7 +74,7 @@ We first download the Elliptic++ dataset to `/tmp/nvflare/datasets/elliptic_pp`
 - `txs_classes.csv`: transaction id and its class (licit or illicit)
 - `txs_edgelist.csv`: connections for transaction ids
 - `txs_features.csv`: transaction id and its features
-Then, we run the local training on each client, as well as the whole dataset.
+Then, we run the local training on each client, as well as on the whole dataset. Again, `--client_id 0` uses all data.
 ```
 python3 code/graphsage_finance_local.py --client_id 0
 python3 code/graphsage_finance_local.py --client_id 1
@@ -87,7 +87,7 @@ nvflare job create -force -j "/tmp/nvflare/jobs/gnn_finance" -w "sag_gnn" -sd "c
 -f app_2/config_fed_client.conf app_script="graphsage_finance_fl.py" app_config="--client_id 2 --epochs 10" \
 -f app_server/config_fed_server.conf num_rounds=7 key_metric="validation_auc" model_class_path="pyg_sage.SAGE" components[0].args.model.args.in_channels=165 components[0].args.model.args.hidden_channels=256 components[0].args.model.args.num_layers=3 components[0].args.model.args.num_classes=2
 ```
-And with the produced job, we run the federated training on both clients via FedAvg using NVFlare Simulator.
+And with the produced job, we run the federated training on both clients via FedAvg using the NVFlare Simulator.
 ```
 nvflare simulator -w /tmp/nvflare/gnn/finance_fl_workspace -n 2 -t 2 /tmp/nvflare/jobs/gnn_finance
 ```
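
A minimal sketch of checking and applying a mailbox-format patch like the one above, assuming it has been saved as `update-gnn-readme.patch` (hypothetical filename) at the root of an NVFlare checkout; `git apply` only previews or stages the diff, while `git am` records it as a commit with the original author and message:
```
# Filename is an assumption; adjust to wherever the patch was saved.
# Preview which files the patch touches (no changes applied).
git apply --stat update-gnn-readme.patch

# Verify the patch applies cleanly against the current checkout.
git apply --check update-gnn-readme.patch

# Apply the patch as a commit, keeping the original author and message.
git am update-gnn-readme.patch
```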