diff --git a/examples/advanced/vertical_xgboost/README.md b/examples/advanced/vertical_xgboost/README.md
deleted file mode 100644
index bddf82f2a6..0000000000
--- a/examples/advanced/vertical_xgboost/README.md
+++ /dev/null
@@ -1,103 +0,0 @@
-# Vertical Federated XGBoost
-This example shows how to use vertical federated learning with [NVIDIA FLARE](https://nvflare.readthedocs.io/en/main/index.html) on tabular data.
-Here we use the optimized gradient boosting library [XGBoost](https://github.com/dmlc/xgboost) and leverage its federated learning support.
-
-Before starting please make sure you set up a [virtual environment](../../README.md#set-up-a-virtual-environment) and install the additional requirements:
-```
-python3 -m pip install -r requirements.txt
-```
-
-## Preparing HIGGS Data
-In this example we showcase a binary classification task based on the [HIGGS dataset](https://mlphysics.ics.uci.edu/data/higgs/), which contains 11 million instances, each with 28 features and 1 class label.
-
-### Download and Store Dataset
-First download the dataset from the HIGGS link above, which is a single zipped `.csv` file.
-By default, we assume the dataset is downloaded, uncompressed, and stored in `DATASET_ROOT/HIGGS.csv`.
-
-### Vertical Data Splits
-In vertical federated learning, sites share overlapping data samples (rows), but contain different features (columns).
-In order to achieve this, we split the HIGGS dataset both horizontally and vertically. As a result, each site has an overlapping subset of the rows and a subset of the 29 columns. Since the first column of HIGGS is the class label, we give site-1 the label column for simplicity's sake.
-
-
-
-Run the following command to prepare the data splits:
-```
-./prepare_data.sh DATASET_ROOT
-```
-> **_NOTE:_** make sure to put the correct path for `DATASET_ROOT`.
-
-### Private Set Intersection (PSI)
-Since not every site will have the same set of data samples (rows), we can use PSI to compare encrypted versions of the sites' datasets in order to jointly compute the intersection based on common IDs. In this example, the HIGGS dataset does not contain unique identifiers so we add a temporary `uid_{idx}` to each instance and give each site a portion of the HIGGS dataset that includes a common overlap. Afterwards the identifiers are dropped since they are only used for matching, and training is then done on the intersected data. To learn more about our PSI protocol implementation, see our [psi example](../psi/README.md).
-
-> **_NOTE:_** The uid can be a composition of multiple variables with a transformation, however in this example we use indices for simplicity. PSI can also be used for computing the intersection of overlapping features, but here we give each site unique features.
-
-Create the psi job using the predefined psi_csv template:
-```
-nvflare job create -j ./jobs/vertical_xgb_psi -w psi_csv -sd ./code/psi -force
-```
-
-Run the psi job to calculate the dataset intersection of the clients at `psi/intersection.txt` inside the psi workspace:
-```
-nvflare simulator ./jobs/vertical_xgb_psi -w /tmp/nvflare/vertical_xgb_psi -n 2 -t 2
-```
-
-## Vertical XGBoost Federated Learning with FLARE
-
-This Vertical XGBoost example leverages the recently added [vertical federated learning support](https://github.com/dmlc/xgboost/issues/8424) in the XGBoost open-source library. This allows for the distributed XGBoost algorithm to operate in a federated manner on vertically split data.
-
-For integrating with FLARE, we can use the predefined `XGBFedController` to run the federated server and control the workflow.
-
-Next, we can use `FedXGBHistogramExecutor` and set XGBoost training parameters in `config_fed_client.json`, or define new training logic by overwriting the `xgb_train()` method.
-
-Lastly, we must subclass `XGBDataLoader` and implement the `load_data()` method. For vertical federated learning, it is important when creating the `xgb.Dmatrix` to set `data_split_mode=1` for column mode, and to specify the presence of a label column `?format=csv&label_column=0` for the csv file. To support PSI, the dataloader can also read in the dataset based on the calculated intersection, and split the data into training and validation.
-
-> **_NOTE:_** For secure mode, make sure to provide the required certificates for the federated communicator.
-
-## Run the Example
-Create the vertical xgboost job using the predefined vertical_xgb template:
-```
-nvflare job create -j ./jobs/vertical_xgb -w vertical_xgb -sd ./code/vertical_xgb -force
-```
-
-Run the vertical xgboost job:
-```
-nvflare simulator ./jobs/vertical_xgb -w /tmp/nvflare/vertical_xgb -n 2 -t 2
-```
-
-The model will be saved to `test.model.json`.
-
-(Feel free to modify the scripts and jobs as desired to change arguments such as number of clients, dataset sizes, training params, etc.)
-
-### GPU Support
-By default, CPU based training is used.
-
-In order to enable GPU accelerated training, first ensure that your machine has CUDA installed and has at least one GPU.
-In `config_fed_client.json` set `"use_gpus": true` and `"tree_method": "hist"` in `xgb_params`.
-Then, in `FedXGBHistogramExecutor` we can use the `device` parameter to map each rank to a GPU device ordinal in `xgb_params`.
-If using multiple GPUs, we can map each rank to a different GPU device, however you can also map each rank to the same GPU device if using a single GPU.
-
-We can create a GPU enabled job using the job CLI:
-```
-nvflare job create -j ./jobs/vertical_xgb_gpu -w vertical_xgb \
--f config_fed_client.conf \
--f config_fed_server.conf use_gpus=true \
--sd ./code/vertical_xgb \
--force
-```
-
-This job can be run:
-```
-nvflare simulator ./jobs/vertical_xgb_gpu -w /tmp/nvflare/vertical_xgb_gpu -n 2 -t 2
-```
-
-## Results
-Model accuracy can be visualized in tensorboard:
-```
-tensorboard --logdir /tmp/nvflare/vertical_xgb/server/simulate_job/tb_events
-```
-
-An example training (pink) and validation (orange) AUC graph from running vertical XGBoost on HIGGS:
-(Used an intersection of 50000 samples across 5 clients each with different features,
-and ran for ~50 rounds due to early stopping.)
-
-
diff --git a/examples/advanced/vertical_xgboost/figs/vertical_xgboost_graph.png b/examples/advanced/vertical_xgboost/figs/vertical_xgboost_graph.png
deleted file mode 100644
index 56e7f2c03c..0000000000
Binary files a/examples/advanced/vertical_xgboost/figs/vertical_xgboost_graph.png and /dev/null differ
diff --git a/examples/advanced/vertical_xgboost/prepare_data.sh b/examples/advanced/vertical_xgboost/prepare_data.sh
deleted file mode 100755
index d938e0c4eb..0000000000
--- a/examples/advanced/vertical_xgboost/prepare_data.sh
+++ /dev/null
@@ -1,23 +0,0 @@
-#!/usr/bin/env bash
-DATASET_PATH="${1}/HIGGS.csv"
-OUTPUT_PATH="/tmp/nvflare/vertical_xgb_data"
-OUTPUT_FILE="higgs.data.csv"
-
-if [ ! -f "${DATASET_PATH}" ]
-then
- echo "Please check if you saved HIGGS dataset in ${DATASET_PATH}"
-fi
-
-echo "Generating HIGGS data splits, reading from ${DATASET_PATH}"
-
-python3 utils/prepare_data.py \
---data_path "${DATASET_PATH}" \
---site_num 2 \
---rows_total_percentage 0.02 \
---rows_overlap_percentage 0.25 \
---out_path "${OUTPUT_PATH}" \
---out_file "${OUTPUT_FILE}"
-
-# Note: HIGGS has 11000000 preshuffled instances; using rows_total_percentage to reduce PSI time for example
-
-echo "Data splits are generated in ${OUTPUT_PATH}"
diff --git a/examples/advanced/vertical_xgboost/requirements.txt b/examples/advanced/vertical_xgboost/requirements.txt
deleted file mode 100644
index b0d1cd29e7..0000000000
--- a/examples/advanced/vertical_xgboost/requirements.txt
+++ /dev/null
@@ -1,7 +0,0 @@
-nvflare~=2.5.0rc
-openmined.psi==1.1.1
-pandas
-torch
-tensorboard
-# require xgboost 2.2 version, for now need to install a nightly build
-https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/federated-secure/xgboost-2.2.0.dev0%2B4601688195708f7c31fcceeb0e0ac735e7311e61-py3-none-manylinux_2_28_x86_64.whl
diff --git a/examples/advanced/xgboost/README.md b/examples/advanced/xgboost/README.md
index 207e94334b..9c9844d217 100644
--- a/examples/advanced/xgboost/README.md
+++ b/examples/advanced/xgboost/README.md
@@ -1,220 +1,54 @@
# Federated Learning for XGBoost
-
-Please make sure you set up virtual environment and Jupyterlab follows [example root readme](../../README.md)
-
-## Introduction to XGBoost and HIGGS Data
-
-You can also follow along in this [notebook](./data_job_setup.ipynb) for an interactive experience.
-
-### XGBoost
-These examples show how to use [NVIDIA FLARE](https://nvflare.readthedocs.io/en/main/index.html) on tabular data applications.
-They use [XGBoost](https://github.com/dmlc/xgboost),
-which is an optimized distributed gradient boosting library.
-
-### HIGGS
-The examples illustrate a binary classification task based on [HIGGS dataset](https://mlphysics.ics.uci.edu/data/higgs/).
-This dataset contains 11 million instances, each with 28 attributes.
-
-Please note that the UCI's website may experience occasional downtime.
-
-## Federated Training of XGBoost
-Several mechanisms have been proposed for training an XGBoost model in a federated learning setting.
-In these examples, we illustrate the use of NVFlare to carry out *horizontal* federated learning using two approaches: histogram-based collaboration and tree-based collaboration.
-
-### Horizontal Federated Learning
-Under horizontal setting, each participant / client joining the federated learning will have part of the whole data / instances / examples/ records, while each instance has all the features.
-This is in contrast to vertical federated learning, where each client has part of the feature values for each instance.
-
-#### Histogram-based Collaboration
-The histogram-based collaboration federated XGBoost approach leverages NVFlare integration of recently added [federated learning support](https://github.com/dmlc/xgboost/issues/7778) in the XGBoost open-source library,
-which allows the existing *distributed* XGBoost training algorithm to operate in a federated manner,
-with the federated clients acting as the distinct workers in the distributed XGBoost algorithm.
-
-In distributed XGBoost, the individual workers share and aggregate coarse information about their respective portions of the training data,
-as required to optimize tree node splitting when building the successive boosted trees.
-
-The shared information is in the form of quantile sketches of feature values as well as corresponding sample gradient and sample Hessian histograms.
-
-Under federated histogram-based collaboration, precisely the same information is exchanged among the clients.
-
-The main differences are that the data is partitioned across the workers according to client data ownership, rather than being arbitrarily partionable, and all communication is via an aggregating federated [gRPC](https://grpc.io) server instead of direct client-to-client communication.
-
-Histograms from different clients, in particular, are aggregated in the server and then communicated back to the clients.
-
-See [histogram-based/README](histogram-based/README.md) for more information on the histogram-based collaboration.
-
-#### Tree-based Collaboration
-Under tree-based collaboration, individual trees are independently trained on each client's local data without aggregating the global sample gradient histogram information.
-Trained trees are collected and passed to the server / other clients for aggregation and further boosting rounds.
-
-The XGBoost Booster api is leveraged to create in-memory Booster objects that persist across rounds to cache predictions from trees added in previous rounds and retain other data structures needed for training.
-
-See [tree-based/README](tree-based/README.md) for more information on two different types of tree-based collaboration algorithms.
-
-
-## HIGGS Data Preparation
-For data preparation, you can follow this [notebook](./data_job_setup.ipynb):
-### Download and Store Data
-To run the examples, we first download the dataset from the HIGGS link above, which is a single `.csv` file.
-By default, we assume the dataset is downloaded, uncompressed, and stored in `~/dataset/HIGGS.csv`.
-
-> **_NOTE:_** If the dataset is downloaded in another place,
-> make sure to modify the corresponding `DATASET_PATH` inside `prepare_data.sh`.
-
-### Data Split
-Since HIGGS dataset is already randomly recorded,
-data split will be specified by the continuous index ranges for each client,
-rather than a vector of random instance indices.
-We provide four options to split the dataset to simulate the non-uniformity in data quantity:
-
-1. uniform: all clients has the same amount of data
-2. linear: the amount of data is linearly correlated with the client ID (1 to M)
-3. square: the amount of data is correlated with the client ID in a squared fashion (1^2 to M^2)
-4. exponential: the amount of data is correlated with the client ID in an exponential fashion (exp(1) to exp(M))
-
-The choice of data split depends on dataset and the number of participants.
-
-For a large dataset like HIGGS, if the number of clients is small (e.g. 5),
-each client will still have sufficient data to train on with uniform split,
-and hence exponential would be used to observe the performance drop caused by non-uniform data split.
-If the number of clients is large (e.g. 20), exponential split will be too aggressive, and linear/square should be used.
-
-Data splits used in this example can be generated with
-```
-bash prepare_data.sh
-```
-
-This will generate data splits for three client sizes: 2, 5 and 20, and 3 split conditions: uniform, square, and exponential.
-If you want to customize for your experiments, please check `utils/prepare_data_split.py`.
-
-> **_NOTE:_** The generated train config files will be stored in the folder `/tmp/nvflare/xgboost_higgs_dataset/`,
-> and will be used by jobs by specifying the path within `config_fed_client.json`
-
-
-## HIGGS job configs preparation under various training schemes
-
-Please follow the [Installation](../../getting_started/README.md) instructions to install NVFlare.
-
-We then prepare the NVFlare job configs for different settings by running
+This example demonstrates how to use NVFlare to train an XGBoost model in a federated learning setting.
+Several potential variations of federated XGBoost are illustrated, including:
+- non-secure horizontal collaboration with histogram-based and tree-based mechanisms.
+- non-secure vertical collaboration with histogram-based mechanism.
+- secure horizontal and vertical collaboration with histogram-based mechanism and homomorphic encryption.
+
+To run the examples and notebooks, please make sure you set up a virtual environment and JupyterLab, following [the example root readme](../../README.md)
+and install the additional requirements:
```
-bash prepare_job_config.sh
+python3 -m pip install -r requirements.txt
```
-This script modifies settings from base job configuration
-(`./tree-based/jobs/bagging_base` or `./tree-based/jobs/cyclic_base`
-or `./histogram-based/jobs/base`),
-and copies the correct data split file generated in the data preparation step.
-
-> **_NOTE:_** To customize your own job configs, you can just edit from the generated ones.
-> Or check the code in `./utils/prepare_job_config.py`.
-
-The script will generate a total of 10 different configs in `tree-based/jobs` for tree-based algorithm:
+## XGBoost
+XGBoost is a machine learning algorithm that uses decision/regression trees to perform classification and regression tasks,
+mapping a vector of feature values to its label prediction. It is especially powerful for tabular data, so even in the age of LLMs
+it is still widely used for many tabular data use cases. It is also preferred for its explainability and efficiency.
-- tree-based cyclic training with uniform data split for 5 clients
-- tree-based cyclic training with non-uniform data split for 5 clients
-- tree-based bagging training with uniform data split and uniform shrinkage for 5 clients
-- tree-based bagging training with non-uniform data split and uniform shrinkage for 5 clients
-- tree-based bagging training with non-uniform data split and scaled shrinkage for 5 clients
-- tree-based cyclic training with uniform data split for 20 clients
-- tree-based cyclic training with non-uniform data split for 20 clients
-- tree-based bagging training with uniform data split and uniform shrinkage for 20 clients
-- tree-based bagging training with non-uniform data split and uniform shrinkage for 20 clients
-- tree-based bagging training with non-uniform data split and scaled shrinkage for 20 clients
+In these examples, we use [DMLC XGBoost](https://github.com/dmlc/xgboost), which is an optimized distributed gradient boosting library.
+It offers advanced features such as GPU-accelerated training and distributed/federated learning support.
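+
+For readers new to XGBoost, below is a minimal, self-contained sketch (not part of this example's code) of training a binary classifier on synthetic tabular data with the standard XGBoost API:
+```
+import numpy as np
+import xgboost as xgb
+
+# Synthetic tabular data: 1000 samples, 28 features, binary labels
+X = np.random.rand(1000, 28)
+y = (X[:, 0] + X[:, 1] > 1).astype(int)
+
+dtrain = xgb.DMatrix(X, label=y)
+params = {"objective": "binary:logistic", "eval_metric": "auc", "max_depth": 8, "eta": 0.1}
+bst = xgb.train(params, dtrain, num_boost_round=10)
+print(bst.predict(dtrain)[:5])  # predicted probabilities for the first 5 samples
+```
+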
+## Data
+We use two datasets: [HIGGS](https://mlphysics.ics.uci.edu/data/higgs/) and [creditcardfraud](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud)
+for the experiments. Both are binary classification tasks, but of significantly different scales:
+the HIGGS dataset contains 11 million instances, each with 28 attributes, while the creditcardfraud dataset contains 284,807 instances, each with 30 attributes.
-The script will also generate 2 configs in `histogram-based/jobs` for histogram-base algorithm:
+We use the HIGGS dataset, given its large scale, to compare the performance of different federated learning settings,
+and the creditcardfraud dataset to demonstrate secure federated learning with homomorphic encryption, where its smaller size keeps computation manageable.
+Please note that the websites may experience occasional downtime.
-- histogram-based training with uniform data split for 2 clients
-- histogram-based training with uniform data split for 5 clients
+First download the datasets from the links above: a single gzipped `HIGGS.csv.gz` file and a single `creditcard.csv` file.
+By default, we assume the datasets are downloaded, uncompressed, and stored as `DATASET_ROOT/HIGGS.csv` and `DATASET_ROOT/creditcard.csv`.
+Each row corresponds to a data sample, and each column corresponds to a feature.
-## Run experiments for tree-based and histogram-based settings
-After you run the two scripts `prepare_data.sh` and `prepare_job_config.sh`,
-please go to sub-folder [tree-based](tree-based) for running tree-based algorithms,
-and sub-folder [histogram-based](histogram-based) for running histogram-based algorithms.
+## Collaboration Modes and Data Split
+Essentially, there are two collaboration modes, horizontal and vertical:
+- In the horizontal case, each participant has access to the same features (columns) of different data samples (rows).
+In this case, every participant holds equal status as a "label owner".
+- In the vertical case, each client has access to different features (columns) of the same data samples (rows).
+We assume that only one site is the "label owner" (also called the "active party").
+To simulate the above two collaboration modes, we split the two datasets both horizontally and vertically, and
+we give site-1 the label column for simplicity.
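+
+To make the two modes concrete, below is a small illustration (not part of the example code) of how a table could be partitioned with pandas; the path and column boundaries are hypothetical:
+```
+import pandas as pd
+
+# HIGGS has no header row; column 0 is the class label
+df = pd.read_csv("DATASET_ROOT/HIGGS.csv", header=None)
+
+# Horizontal: same columns, disjoint rows per site
+site1_rows = df.iloc[: len(df) // 2]
+site2_rows = df.iloc[len(df) // 2 :]
+
+# Vertical: same (overlapping) rows, disjoint columns; site-1 keeps the label column
+site1_cols = df.iloc[:, :15]
+site2_cols = df.iloc[:, 15:]
+```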
-## GPU support
-By default, CPU based training is used.
-
-If the CUDA is installed on the site, tree construction and prediction can be
-accelerated using GPUs.
-
-To enable GPU accelerated training, in `config_fed_client.json` set `"use_gpus": true` and `"tree_method": "hist"`.
-Then, in `FedXGBHistogramExecutor` we use the `device` parameter to map each rank to a GPU device ordinal in `xgb_params`.
-For a single GPU, assuming it has enough memory, we can map each rank to the same device with `params["device"] = f"cuda:0"`.
-
-### Multi GPU support
-
-Multiple GPUs can be supported by running one NVFlare client for each GPU.
-
-In the `xgb_params`, we can set the `device` parameter to map each rank to a corresponding GPU device ordinal in with `params["device"] = f"cuda:{self.rank}"`
-
-Assuming there are 2 physical client sites, each with 2 GPUs (id 0 and 1).
-We can start 4 NVFlare client processes (site-1a, site-1b, site-2a, site-2b), one for each GPU.
-The job layout looks like this,
-::
-
- xgb_multi_gpu_job
- ├── app_server
- │ └── config
- │ └── config_fed_server.json
- ├── app_site1_gpu0
- │ └── config
- │ └── config_fed_client.json
- ├── app_site1_gpu1
- │ └── config
- │ └── config_fed_client.json
- ├── app_site2_gpu0
- │ └── config
- │ └── config_fed_client.json
- ├── app_site2_gpu1
- │ └── config
- │ └── config_fed_client.json
- └── meta.json
-
-Each app is deployed to its own client site. Here is the `meta.json`,
-::
-
- {
- "name": "xgb_multi_gpu_job",
- "resource_spec": {
- "site-1a": {
- "num_of_gpus": 1,
- "mem_per_gpu_in_GiB": 1
- },
- "site-1b": {
- "num_of_gpus": 1,
- "mem_per_gpu_in_GiB": 1
- },
- "site-2a": {
- "num_of_gpus": 1,
- "mem_per_gpu_in_GiB": 1
- },
- "site-2b": {
- "num_of_gpus": 1,
- "mem_per_gpu_in_GiB": 1
- }
- },
- "deploy_map": {
- "app_server": [
- "server"
- ],
- "app_site1_gpu0": [
- "site-1a"
- ],
- "app_site1_gpu1": [
- "site-1b"
- ],
- "app_site2_gpu0": [
- "site-2a"
- ],
- "app_site2_gpu1": [
- "site-2b"
- ]
- },
- "min_clients": 4
- }
-
-For federated XGBoost, all clients must participate in the training. Therefore,
-`min_clients` must equal to the number of clients.
-
+## Federated Training of XGBoost
+Continue with this example for two scenarios:
+### [Federated XGBoost without Encryption](./fedxgb/README.md)
+This example includes instructions on running federated XGBoost without encryption under histogram-based and tree-based horizontal
+collaboration, and histogram-based vertical collaboration.
+
+### [Secure Federated XGBoost with Homomorphic Encryption](./fedxgb_secure/README.md)
+This example includes instructions on running secure federated XGBoost with homomorphic encryption under
+histogram-based horizontal and vertical collaboration. Note that tree-based collaboration does not have security concerns
+that can be handled by encryption.
\ No newline at end of file
diff --git a/examples/advanced/xgboost/fedxgb/README.md b/examples/advanced/xgboost/fedxgb/README.md
new file mode 100644
index 0000000000..0eb70f7a38
--- /dev/null
+++ b/examples/advanced/xgboost/fedxgb/README.md
@@ -0,0 +1,212 @@
+# Federated XGBoost
+Several mechanisms have been proposed for training an XGBoost model in a federated learning setting.
+In these examples, we illustrate the use of NVFlare to carry out *horizontal* federated learning using two approaches, histogram-based and tree-based collaboration,
+as well as *vertical* federated learning using histogram-based collaboration.
+
+## Horizontal Federated XGBoost
+Under the horizontal setting, each participant joining the federated learning holds part of
+the whole set of data samples / instances / records, while each sample has all the features.
+
+### Histogram-based Collaboration
+The histogram-based collaboration federated XGBoost approach leverages NVFlare integration of [federated learning support](https://github.com/dmlc/xgboost/issues/7778) in the XGBoost open-source library,
+which allows the existing *distributed* XGBoost training algorithm to operate in a federated manner,
+with the federated clients acting as the distinct workers in the distributed XGBoost algorithm.
+
+In distributed XGBoost, the individual workers share and aggregate gradient information about their respective portions of the training data,
+as required to optimize tree node splitting when building the successive boosted trees.
+
+The shared information is in the form of quantile sketches of feature values as well as corresponding sample gradient and sample Hessian histograms.
+
+Under federated histogram-based collaboration, precisely the same information is exchanged among the clients.
+The main differences are that the data is partitioned across the workers according to client data ownership, rather than being arbitrarily partitionable, and all communication is via an aggregating federated [gRPC](https://grpc.io) server instead of direct client-to-client communication.
+Histograms from different clients, in particular, are aggregated in the server and then communicated back to the clients.
+
+### Tree-based Collaboration
+Under tree-based collaboration, individual trees are independently trained on each client's local data without aggregating the global sample gradient histogram information.
+Trained trees are collected and passed to the server / other clients for aggregation and / or further boosting rounds.
+Under this setting, we can further distinguish between two types of tree-based collaboration: cyclic and bagging.
+
+#### Cyclic Training
+"Cyclic XGBoost" is one way of performing tree-based federated boosting with
+multiple sites: at each round of tree boosting, instead of relying on the whole
+data statistics collected from all clients, the boosting relies on only 1 client's
+local data. The resulting tree sequence is then forwarded to the next client for
+next round's boosting. Such training scheme have been proposed in literatures [1] [2].
+
+#### Bagging Aggregation
+
+"Bagging XGBoost" is another way of performing tree-based federated boosting with multiple sites: at each round of tree boosting, all sites start from the same "global model", and boost a number of trees (in current example, 1 tree) based on their local data. The resulting trees are then send to server. A bagging aggregation scheme is applied to all the submitted trees to update the global model, which is further distributed to all clients for next round's boosting.
+
+This scheme bears certain similarity to the [Random Forest mode](https://xgboost.readthedocs.io/en/stable/tutorials/rf.html) of XGBoost, where a forest of `num_parallel_tree` trees is boosted based on random row/column splits, rather than a single tree. Under the federated learning setting, the split is fixed to clients rather than being random, and no column subsampling is applied.
+
+In addition to the basic uniform shrinkage setting, where all clients use the same learning rate, we also enable scaled shrinkage across clients, weighting the aggregation according to each client's data size. Based on our research, this significantly improves the model's performance on non-uniform quantity splits over the HIGGS data.
+
+
+Specifically, the global model is updated by aggregating the trees from all clients as a forest, and the global model is then broadcast back to all clients for local prediction and further training.
+
+The XGBoost Booster API is leveraged to create in-memory Booster objects that persist across rounds, caching predictions from trees added in previous rounds and retaining other data structures needed for training.
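+
+As a rough illustration of this caching behavior (independent of NVFlare; the data path is hypothetical), the standard XGBoost API allows a Booster to be extended round by round:
+```
+import xgboost as xgb
+
+dtrain = xgb.DMatrix("site-1.csv?format=csv&label_column=0")  # hypothetical local data file
+params = {"objective": "binary:logistic", "eta": 0.1, "max_depth": 8, "eval_metric": "auc"}
+
+# First round: boost one tree from scratch
+bst = xgb.train(params, dtrain, num_boost_round=1)
+
+# Later rounds: keep boosting on top of the existing (possibly aggregated) model
+bst = xgb.train(params, dtrain, num_boost_round=1, xgb_model=bst)
+```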
+
+## Vertical Federated XGBoost
+Under the vertical setting, each participant joining the federated learning holds part of
+the whole set of features, while all sites share the same overlapping instances.
+
+### Private Set Intersection (PSI)
+Since not every site will have the same set of data samples (rows), we will use PSI to first compare encrypted versions of the sites' datasets in order to jointly compute the intersection based on common IDs. In the following example, we add a `uid_{idx}` to each instance and give each site
+a portion of the dataset that includes a common overlap. After PSI, the identifiers are dropped since they are only used for matching, and training is then done on the intersected data. To learn more about our PSI protocol implementation, see our [psi example](../../psi/README.md).
+> **_NOTE:_** The uid can be a composition of multiple variables with a transformation, however in this example we use indices for simplicity. PSI can also be used for computing the intersection of overlapping features, but here we give each site unique features.
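+
+A rough illustration of the uid construction described above (the actual split logic lives in `utils/prepare_data_vertical.py`; this is only a sketch):
+```
+import pandas as pd
+
+# HIGGS has no header row
+df = pd.read_csv("DATASET_ROOT/HIGGS.csv", header=None)
+
+# Prepend a synthetic identifier used only for PSI matching; it is dropped before training
+df.insert(0, "uid", [f"uid_{idx}" for idx in range(len(df))])
+```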
+
+### Histogram-based Collaboration
+Similar to its horizontal counterpart, histogram-based vertical collaboration
+aggregates the gradient information from each site and updates the global model accordingly, resulting in
+the same model as centralized training or histogram-based horizontal training.
+We leverage the [vertical federated learning support](https://github.com/dmlc/xgboost/issues/8424) in the XGBoost open-source library. This allows for the distributed XGBoost algorithm to operate in a federated manner on vertically split data.
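+
+For reference, a minimal sketch of how a vertical data loader could construct the `DMatrix`, following the pattern from the original vertical XGBoost example (the path is the default output of `prepare_data.sh`; whether this site owns the label is an assumption):
+```
+import xgboost as xgb
+
+# Column-split (vertical) mode: data_split_mode=1; only the label owner declares a label column
+label_owner = True  # assumption: this process runs at the active party (site-1)
+if label_owner:
+    uri = "/tmp/nvflare/dataset/xgboost_higgs_vertical/site-1/higgs.data.csv?format=csv&label_column=0"
+else:
+    uri = "/tmp/nvflare/dataset/xgboost_higgs_vertical/site-2/higgs.data.csv?format=csv"
+dtrain = xgb.DMatrix(uri, data_split_mode=1)
+```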
+
+## Data Preparation
+Assuming the HIGGS data has been downloaded following [the instructions](../README.md), we further split the data
+horizontally and vertically for federated learning.
+
+In horizontal settings, each party holds different data samples with the same set of features.
+To simulate this, we split the HIGGS data by rows, assigning each party a subset of the data samples.
+In vertical settings, each party holds different features of the same data samples, and usually, the population
+on each site will not fully overlap. To simulate this, we split the HIGGS data by both columns and rows, so that each site
+has different features with overlapping data samples.
+More details will be provided in the following sub-sections.
+
+
+Data splits used in this example can be generated with
+```
+DATASET_ROOT=~/.cache/dataset/HIGGS
+bash prepare_data.sh ${DATASET_ROOT}
+```
+Please modify the path according to your own dataset location.
+The generated horizontal train config files and vertical data files will be stored in the
+folder `/tmp/nvflare/dataset/`; this output path can be changed in the script `prepare_data.sh`.
+
+### Horizontal Data Split
+Since the HIGGS dataset is already randomly ordered,
+the horizontal data split is specified by contiguous index ranges for each client,
+rather than a vector of random instance indices.
+We provide four options to split the dataset to simulate the non-uniformity in data quantity:
+
+1. uniform: all clients have the same amount of data
+2. linear: the amount of data is linearly correlated with the client ID (1 to M)
+3. square: the amount of data is correlated with the client ID in a squared fashion (1^2 to M^2)
+4. exponential: the amount of data is correlated with the client ID in an exponential fashion (exp(1) to exp(M))
+
+The choice of data split depends on the dataset and the number of participants.
+
+For a large dataset like HIGGS, if the number of clients is small (e.g. 5),
+each client will still have sufficient data to train on with uniform split,
+and hence exponential would be used to observe the performance drop caused by non-uniform data split.
+If the number of clients is large (e.g. 20), exponential split will be too aggressive, and linear/square should be used.
+
+In this example, we generate data splits with three client sizes: 2, 5 and 20, under three split conditions: uniform, square, and exponential.
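+
+As a rough illustration (not the actual split utility, which lives in `utils/prepare_data_horizontal.py`), the relative data share per client under each split method can be sketched as:
+```
+import numpy as np
+
+def client_proportions(num_clients: int, split_method: str) -> np.ndarray:
+    """Relative data share for client IDs 1..M under each split method."""
+    ids = np.arange(1, num_clients + 1, dtype=float)
+    if split_method == "uniform":
+        weights = np.ones_like(ids)
+    elif split_method == "linear":
+        weights = ids
+    elif split_method == "square":
+        weights = ids**2
+    elif split_method == "exponential":
+        weights = np.exp(ids)
+    else:
+        raise ValueError(f"unknown split method: {split_method}")
+    return weights / weights.sum()
+
+print(client_proportions(5, "exponential"))  # heavily skewed toward the last client
+```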
+
+### Vertical Data Split
+For vertical, we simulate a realistic 2-client scenario where participants share overlapping data samples (rows) with different features (columns).
+We split the HIGGS dataset both horizontally and vertically. As a result, each site has an overlapping subset of the rows and a subset of the 29 columns. Since the first column of HIGGS is the class label, we give site-1 the label column for simplicity's sake.
+
+
+PSI will be performed first to identify and match the overlapping samples, then the training will be done on the intersected data.
+
+
+## Experiments
+We first run the centralized trainings to get the baseline performance, then run the federated XGBoost training using NVFlare Simulator via [JobAPI](https://nvflare.readthedocs.io/en/main/programming_guide/fed_job_api.html).
+
+### Centralized Baselines
+For centralized training, we train the XGBoost model on the whole dataset, as well as on subsets with different subsample rates
+and parallel tree settings.
+```
+bash run_experiment_centralized.sh ${DATASET_ROOT}
+```
+The results by default will be stored in the folder `/tmp/nvflare/workspace/centralized/`.
+
+
+
+As shown, including multiple trees in a single round may not yield significant performance gain,
+and can even hurt accuracy if the subsample rate is too low (e.g. 0.05).
+
+### Horizontal Experiments
+The following cases will be covered:
+- Histogram-based collaboration based on uniform data split for 2 / 5 clients
+- Tree-based collaboration with cyclic training based on uniform / exponential / square data split for 5 / 20 clients
+- Tree-based collaboration with bagging training based on uniform / exponential / square data split for 5 / 20 clients w/ and w/o scaled learning rate
+
+Histogram-based experiments can be run with:
+```
+bash run_experiment_horizontal_histogram.sh
+```
+
+> **_NOTE:_** "histogram_v2" implements a fault-tolerant XGBoost training by using
+> NVFlare as the communicator rather than relying on XGBoost MPI. For more information, please refer to this [TechBlog](https://developer.nvidia.com/blog/federated-xgboost-made-practical-and-productive-with-nvidia-flare/).
+
+Model accuracy curve during training can be visualized in tensorboard,
+recorded in the simulator folder under `/tmp/nvflare/workspace/works/`.
+As expected, we can observe that all histogram-based experiments result in curves identical to centralized training:
+
+
+Tree-based experiments can be run with:
+```
+bash run_experiment_horizontal_tree.sh
+```
+The resulting validation AUC curves are shown below:
+
+
+
+
+As illustrated, we can have the following observations:
+- cyclic training performs reasonably well under uniform split (the purple curve), but under non-uniform split it suffers a significant performance drop (the brown curve)
+- bagging training performs better than cyclic under both uniform and non-uniform data splits (orange vs. purple, red/green vs. brown)
+- with uniform shrinkage, bagging suffers a significant performance drop under non-uniform split (green vs. orange)
+- data-size dependent shrinkage recovers this performance drop (red vs. green), achieving comparable or better performance than under the uniform data split (red vs. orange)
+- bagging under uniform data split (orange), and bagging with data-size dependent shrinkage under non-uniform data split (red), achieve comparable or better performance than the centralized training baseline (blue)
+
+Regarding model size, centralized training and cyclic training produce a model consisting of `num_round` trees,
+while the bagging models consist of `num_round * num_client` trees, since at each round
+bagging training boosts a forest made up of individually trained trees from each client.
+
+### Vertical Experiments
+
+Create the psi job using the predefined psi_csv template:
+```
+nvflare job create -j ./jobs/vertical_xgb_psi -w psi_csv -sd ./code/psi -force
+```
+
+Run the psi job to calculate the dataset intersection of the clients at `psi/intersection.txt` inside the psi workspace:
+```
+nvflare simulator ./jobs/vertical_xgb_psi -w /tmp/nvflare/vertical_xgb_psi -n 2 -t 2
+```
+
+Create the vertical xgboost job using the predefined vertical_xgb template:
+```
+nvflare job create -j ./jobs/vertical_xgb -w vertical_xgb -sd ./code/vertical_xgb -force
+```
+
+Run the vertical xgboost job:
+```
+nvflare simulator ./jobs/vertical_xgb -w /tmp/nvflare/vertical_xgb -n 2 -t 2
+```
+
+Model accuracy can be visualized in tensorboard:
+```
+tensorboard --logdir /tmp/nvflare/vertical_xgb/server/simulate_job/tb_events
+```
+
+An example validation AUC graph (red) from running vertical XGBoost on HIGGS, compared with the centralized baseline (blue):
+Since in this case we only used ~50k samples, the performance is worse than centralized training on the full dataset.
+
+
+
+## GPU Support
+By default, CPU-based training is used.
+
+In order to enable GPU accelerated training, first ensure that your machine has CUDA installed and has at least one GPU.
+In `XGBFedController` set `"use_gpus": true`.
+Then, in `FedXGBHistogramExecutor` we can use the `device` parameter to map each rank to a GPU device ordinal in `xgb_params`.
+If using multiple GPUs, we can map each rank to a different GPU device, however you can also map each rank to the same GPU device if using a single GPU.
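+
+A minimal sketch of the rank-to-device mapping described above (the parameter values mirror the example configuration; the rank and GPU count here are placeholders):
+```
+# Map each federated rank to a GPU ordinal; with a single GPU, every rank can map to cuda:0
+rank = 0       # hypothetical rank of this client process
+num_gpus = 1   # assumption: number of GPUs available on this site
+xgb_params = {
+    "max_depth": 8,
+    "eta": 0.1,
+    "objective": "binary:logistic",
+    "eval_metric": "auc",
+    "tree_method": "hist",
+    "device": f"cuda:{rank % num_gpus}",
+}
+```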
+
+
+## Reference
+[1] Zhao, L. et al., "InPrivate Digging: Enabling Tree-based Distributed Data Mining with Differential Privacy," IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, 2018, pp. 2087-2095
+
+[2] Yamamoto, F. et al., "New Approaches to Federated XGBoost Learning for Privacy-Preserving Data Analysis," ICONIP 2020 - International Conference on Neural Information Processing, 2020, Lecture Notes in Computer Science, vol 12533
diff --git a/examples/advanced/xgboost/tree-based/figs/20_client.png b/examples/advanced/xgboost/fedxgb/figs/20_client.png
similarity index 100%
rename from examples/advanced/xgboost/tree-based/figs/20_client.png
rename to examples/advanced/xgboost/fedxgb/figs/20_client.png
diff --git a/examples/advanced/xgboost/tree-based/figs/5_client.png b/examples/advanced/xgboost/fedxgb/figs/5_client.png
similarity index 100%
rename from examples/advanced/xgboost/tree-based/figs/5_client.png
rename to examples/advanced/xgboost/fedxgb/figs/5_client.png
diff --git a/examples/advanced/xgboost/tree-based/figs/Centralized.png b/examples/advanced/xgboost/fedxgb/figs/Centralized.png
similarity index 100%
rename from examples/advanced/xgboost/tree-based/figs/Centralized.png
rename to examples/advanced/xgboost/fedxgb/figs/Centralized.png
diff --git a/examples/advanced/xgboost/fedxgb/figs/histogram.png b/examples/advanced/xgboost/fedxgb/figs/histogram.png
new file mode 100644
index 0000000000..fb95949a7e
Binary files /dev/null and b/examples/advanced/xgboost/fedxgb/figs/histogram.png differ
diff --git a/examples/advanced/vertical_xgboost/figs/vertical_fl.png b/examples/advanced/xgboost/fedxgb/figs/vertical_fl.png
similarity index 100%
rename from examples/advanced/vertical_xgboost/figs/vertical_fl.png
rename to examples/advanced/xgboost/fedxgb/figs/vertical_fl.png
diff --git a/examples/advanced/xgboost/fedxgb/figs/vertical_xgb.png b/examples/advanced/xgboost/fedxgb/figs/vertical_xgb.png
new file mode 100644
index 0000000000..b5d9e0a2d1
Binary files /dev/null and b/examples/advanced/xgboost/fedxgb/figs/vertical_xgb.png differ
diff --git a/examples/advanced/xgboost/data_job_setup.ipynb b/examples/advanced/xgboost/fedxgb/notebooks/data_job_setup.ipynb
similarity index 100%
rename from examples/advanced/xgboost/data_job_setup.ipynb
rename to examples/advanced/xgboost/fedxgb/notebooks/data_job_setup.ipynb
diff --git a/examples/advanced/xgboost/histogram-based/xgboost_histogram_higgs.ipynb b/examples/advanced/xgboost/fedxgb/notebooks/xgboost_histogram_higgs.ipynb
similarity index 100%
rename from examples/advanced/xgboost/histogram-based/xgboost_histogram_higgs.ipynb
rename to examples/advanced/xgboost/fedxgb/notebooks/xgboost_histogram_higgs.ipynb
diff --git a/examples/advanced/xgboost/tree-based/xgboost_tree_higgs.ipynb b/examples/advanced/xgboost/fedxgb/notebooks/xgboost_tree_higgs.ipynb
similarity index 100%
rename from examples/advanced/xgboost/tree-based/xgboost_tree_higgs.ipynb
rename to examples/advanced/xgboost/fedxgb/notebooks/xgboost_tree_higgs.ipynb
diff --git a/examples/advanced/xgboost/fedxgb/prepare_data.sh b/examples/advanced/xgboost/fedxgb/prepare_data.sh
new file mode 100755
index 0000000000..5d687bb5a7
--- /dev/null
+++ b/examples/advanced/xgboost/fedxgb/prepare_data.sh
@@ -0,0 +1,37 @@
+#!/usr/bin/env bash
+DATASET_PATH="${1}/HIGGS.csv"
+if [ ! -f "${DATASET_PATH}" ]
+then
+ echo "Please check if you saved HIGGS dataset in ${DATASET_PATH}"
+ exit 1
+fi
+
+echo "Generating HIGGS data splits, reading from ${DATASET_PATH}"
+
+OUTPUT_PATH="/tmp/nvflare/dataset/xgboost_higgs_horizontal"
+for site_num in 2 5 20;
+do
+ for split_mode in uniform exponential square;
+ do
+ python3 utils/prepare_data_horizontal.py \
+ --data_path "${DATASET_PATH}" \
+ --site_num ${site_num} \
+ --size_total 11000000 \
+ --size_valid 1000000 \
+ --split_method ${split_mode} \
+ --out_path "${OUTPUT_PATH}/${site_num}_${split_mode}"
+ done
+done
+echo "Horizontal data splits are generated in ${OUTPUT_PATH}"
+
+OUTPUT_PATH="/tmp/nvflare/dataset/xgboost_higgs_vertical"
+OUTPUT_FILE="higgs.data.csv"
+# Note: HIGGS has 11 million preshuffled instances; using rows_total_percentage to reduce PSI time for example
+python3 utils/prepare_data_vertical.py \
+--data_path "${DATASET_PATH}" \
+--site_num 2 \
+--rows_total_percentage 0.02 \
+--rows_overlap_percentage 0.25 \
+--out_path "${OUTPUT_PATH}" \
+--out_file "${OUTPUT_FILE}"
+echo "Vertical data splits are generated in ${OUTPUT_PATH}"
diff --git a/examples/advanced/xgboost/fedxgb/run_experiment_centralized.sh b/examples/advanced/xgboost/fedxgb/run_experiment_centralized.sh
new file mode 100755
index 0000000000..4a34d6e8c3
--- /dev/null
+++ b/examples/advanced/xgboost/fedxgb/run_experiment_centralized.sh
@@ -0,0 +1,15 @@
+#!/usr/bin/env bash
+DATASET_PATH="${1}/HIGGS.csv"
+if [ ! -f "${DATASET_PATH}" ]
+then
+ echo "Please check if you saved HIGGS dataset in ${DATASET_PATH}"
+ exit 1
+fi
+
+python3 utils/baseline_centralized.py --num_parallel_tree 1 --data_path "${DATASET_PATH}"
+python3 utils/baseline_centralized.py --num_parallel_tree 1 --data_path "${DATASET_PATH}" --train_in_one_session
+python3 utils/baseline_centralized.py --num_parallel_tree 5 --subsample 0.8 --data_path "${DATASET_PATH}"
+python3 utils/baseline_centralized.py --num_parallel_tree 5 --subsample 0.2 --data_path "${DATASET_PATH}"
+python3 utils/baseline_centralized.py --num_parallel_tree 20 --subsample 0.8 --data_path "${DATASET_PATH}"
+python3 utils/baseline_centralized.py --num_parallel_tree 20 --subsample 0.05 --data_path "${DATASET_PATH}"
+
diff --git a/examples/advanced/xgboost/fedxgb/run_experiment_horizontal_histogram.sh b/examples/advanced/xgboost/fedxgb/run_experiment_horizontal_histogram.sh
new file mode 100755
index 0000000000..c04735e15f
--- /dev/null
+++ b/examples/advanced/xgboost/fedxgb/run_experiment_horizontal_histogram.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+python3 xgb_fl_job_horizontal.py --site_num 2 --training_algo histogram --split_method uniform --lr_mode uniform --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 5 --training_algo histogram --split_method uniform --lr_mode uniform --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 2 --training_algo histogram_v2 --split_method uniform --lr_mode uniform --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 5 --training_algo histogram_v2 --split_method uniform --lr_mode uniform --data_split_mode horizontal
diff --git a/examples/advanced/xgboost/fedxgb/run_experiment_horizontal_tree.sh b/examples/advanced/xgboost/fedxgb/run_experiment_horizontal_tree.sh
new file mode 100755
index 0000000000..3ab5573e63
--- /dev/null
+++ b/examples/advanced/xgboost/fedxgb/run_experiment_horizontal_tree.sh
@@ -0,0 +1,12 @@
+#!/usr/bin/env bash
+python3 xgb_fl_job_horizontal.py --site_num 5 --training_algo bagging --split_method exponential --lr_mode uniform --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 5 --training_algo bagging --split_method exponential --lr_mode scaled --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 5 --training_algo bagging --split_method uniform --lr_mode uniform --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 5 --training_algo cyclic --split_method exponential --lr_mode uniform --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 5 --training_algo cyclic --split_method uniform --lr_mode uniform --data_split_mode horizontal
+
+python3 xgb_fl_job_horizontal.py --site_num 20 --training_algo bagging --split_method square --lr_mode uniform --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 20 --training_algo bagging --split_method square --lr_mode scaled --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 20 --training_algo bagging --split_method uniform --lr_mode uniform --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 20 --training_algo cyclic --split_method square --lr_mode uniform --data_split_mode horizontal
+python3 xgb_fl_job_horizontal.py --site_num 20 --training_algo cyclic --split_method uniform --lr_mode uniform --data_split_mode horizontal
\ No newline at end of file
diff --git a/examples/advanced/xgboost/fedxgb/run_experiment_vertical.sh b/examples/advanced/xgboost/fedxgb/run_experiment_vertical.sh
new file mode 100755
index 0000000000..35abee98fd
--- /dev/null
+++ b/examples/advanced/xgboost/fedxgb/run_experiment_vertical.sh
@@ -0,0 +1,3 @@
+#!/usr/bin/env bash
+python3 xgb_fl_job_vertical_psi.py
+python3 xgb_fl_job_vertical.py
\ No newline at end of file
diff --git a/examples/advanced/xgboost/histogram-based/jobs/base/app/custom/higgs_data_loader.py b/examples/advanced/xgboost/fedxgb/src/higgs_data_loader.py
similarity index 100%
rename from examples/advanced/xgboost/histogram-based/jobs/base/app/custom/higgs_data_loader.py
rename to examples/advanced/xgboost/fedxgb/src/higgs_data_loader.py
diff --git a/examples/advanced/vertical_xgboost/code/psi/local_psi.py b/examples/advanced/xgboost/fedxgb/src/local_psi.py
similarity index 100%
rename from examples/advanced/vertical_xgboost/code/psi/local_psi.py
rename to examples/advanced/xgboost/fedxgb/src/local_psi.py
diff --git a/examples/advanced/vertical_xgboost/code/vertical_xgb/vertical_data_loader.py b/examples/advanced/xgboost/fedxgb/src/vertical_data_loader.py
similarity index 100%
rename from examples/advanced/vertical_xgboost/code/vertical_xgb/vertical_data_loader.py
rename to examples/advanced/xgboost/fedxgb/src/vertical_data_loader.py
diff --git a/examples/advanced/xgboost/utils/baseline_centralized.py b/examples/advanced/xgboost/fedxgb/utils/baseline_centralized.py
similarity index 94%
rename from examples/advanced/xgboost/utils/baseline_centralized.py
rename to examples/advanced/xgboost/fedxgb/utils/baseline_centralized.py
index a462a539a7..dcac3e5f01 100644
--- a/examples/advanced/xgboost/utils/baseline_centralized.py
+++ b/examples/advanced/xgboost/fedxgb/utils/baseline_centralized.py
@@ -25,11 +25,13 @@
def xgboost_args_parser():
parser = argparse.ArgumentParser(description="Centralized XGBoost training with random forest options")
- parser.add_argument("--data_path", type=str, default="./dataset/HIGGS_UCI.csv", help="path to dataset file")
+ parser.add_argument("--data_path", type=str, help="path to dataset file")
parser.add_argument("--num_parallel_tree", type=int, default=1, help="num_parallel_tree for random forest setting")
parser.add_argument("--subsample", type=float, default=1, help="subsample for random forest setting")
parser.add_argument("--num_rounds", type=int, default=100, help="number of boosting rounds")
- parser.add_argument("--workspace_root", type=str, default="workspaces", help="workspaces root")
+ parser.add_argument(
+ "--workspace_root", type=str, default="/tmp/nvflare/workspace/centralized", help="workspaces root"
+ )
parser.add_argument("--tree_method", type=str, default="hist", help="tree_method")
parser.add_argument("--train_in_one_session", action="store_true", help="whether to train in one session")
return parser
@@ -63,7 +65,7 @@ def train_one_by_one(train_data, val_data, xgb_params, num_rounds, val_label, wr
y_pred = bst_last.predict(val_data)
roc = roc_auc_score(val_label, y_pred)
print(f"Round: {bst_last.num_boosted_rounds()} model testing AUC {roc}")
- writer.add_scalar("AUC", roc, r - 1)
+ writer.add_scalar("eval_metrics", roc, r - 1)
# Train new model
print(f"Round: {r} Base ", end="")
bst = xgb.train(
@@ -152,7 +154,7 @@ def main():
y_pred = bst.predict(dmat_valid)
roc = roc_auc_score(y_higgs[0:valid_num], y_pred)
print(f"Base model: {roc}")
- writer.add_scalar("AUC", roc, num_rounds - 1)
+ writer.add_scalar("eval_metrics", roc, num_rounds - 1)
writer.close()
diff --git a/examples/advanced/xgboost/utils/prepare_data_split.py b/examples/advanced/xgboost/fedxgb/utils/prepare_data_horizontal.py
similarity index 100%
rename from examples/advanced/xgboost/utils/prepare_data_split.py
rename to examples/advanced/xgboost/fedxgb/utils/prepare_data_horizontal.py
diff --git a/examples/advanced/vertical_xgboost/utils/prepare_data.py b/examples/advanced/xgboost/fedxgb/utils/prepare_data_vertical.py
similarity index 100%
rename from examples/advanced/vertical_xgboost/utils/prepare_data.py
rename to examples/advanced/xgboost/fedxgb/utils/prepare_data_vertical.py
diff --git a/examples/advanced/xgboost/fedxgb/xgb_fl_job_horizontal.py b/examples/advanced/xgboost/fedxgb/xgb_fl_job_horizontal.py
new file mode 100644
index 0000000000..718f7fd624
--- /dev/null
+++ b/examples/advanced/xgboost/fedxgb/xgb_fl_job_horizontal.py
@@ -0,0 +1,260 @@
+# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import json
+import os
+
+from src.higgs_data_loader import HIGGSDataLoader
+
+from nvflare.app_common.widgets.convert_to_fed_event import ConvertToFedEvent
+from nvflare.app_opt.tracking.tb.tb_receiver import TBAnalyticsReceiver
+from nvflare.app_opt.tracking.tb.tb_writer import TBWriter
+from nvflare.job_config.api import FedJob
+
+ALGO_DIR_MAP = {
+ "bagging": "tree-based",
+ "cyclic": "tree-based",
+ "histogram": "histogram-based",
+ "histogram_v2": "histogram-based",
+}
+
+
+def define_parser():
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ "--data_root",
+ type=str,
+ default="/tmp/nvflare/dataset/xgboost_higgs",
+ help="Path to dataset files for each site",
+ )
+ parser.add_argument("--site_num", type=int, default=2, help="Total number of sites")
+ parser.add_argument("--round_num", type=int, default=100, help="Total number of training rounds")
+ parser.add_argument(
+ "--training_algo", type=str, default="histogram", choices=list(ALGO_DIR_MAP.keys()), help="Training algorithm"
+ )
+ parser.add_argument("--split_method", type=str, default="uniform", help="How to split the dataset")
+ parser.add_argument("--lr_mode", type=str, default="uniform", help="Whether to use uniform or scaled shrinkage")
+ parser.add_argument("--nthread", type=int, default=16, help="nthread for xgboost")
+ parser.add_argument(
+ "--tree_method", type=str, default="hist", help="tree_method for xgboost - use hist for best perf"
+ )
+ parser.add_argument(
+ "--data_split_mode",
+ type=str,
+ default="horizontal",
+ choices=["horizontal", "vertical"],
+ help="dataset split mode, horizontal or vertical",
+ )
+ return parser.parse_args()
+
+
+def _get_job_name(args) -> str:
+ return f"higgs_{args.site_num}_{args.training_algo}_{args.split_method}_split_{args.lr_mode}_lr"
+
+
+def _get_data_path(args) -> str:
+ return f"{args.data_root}_{args.data_split_mode}/{args.site_num}_{args.split_method}"
+
+
+def _read_json(filename):
+ if not os.path.isfile(filename):
+ raise ValueError(f"{filename} does not exist!")
+ with open(filename, "r") as f:
+ return json.load(f)
+
+
+def _get_lr_scale_from_split_json(data_split: dict):
+ split = {}
+ total_data_num = 0
+ for k, v in data_split["data_index"].items():
+ if k == "valid":
+ continue
+ data_num = int(v["end"] - v["start"])
+ total_data_num += data_num
+ split[k] = data_num
+
+ lr_scales = {}
+ for k in split:
+ lr_scales[k] = split[k] / total_data_num
+
+ return lr_scales
+
+
+def main():
+ args = define_parser()
+ job_name = _get_job_name(args)
+ dataset_path = _get_data_path(args)
+
+ site_num = args.site_num
+ job = FedJob(name=job_name, min_clients=site_num)
+
+ # Define the controller workflow and send to server
+ if args.training_algo == "histogram":
+ from nvflare.app_opt.xgboost.histogram_based.controller import XGBFedController
+
+ controller = XGBFedController()
+ from nvflare.app_opt.xgboost.histogram_based.executor import FedXGBHistogramExecutor
+
+ executor = FedXGBHistogramExecutor(
+ data_loader_id="dataloader",
+ num_rounds=args.round_num,
+ early_stopping_rounds=2,
+ metrics_writer_id="metrics_writer",
+ xgb_params={
+ "max_depth": 8,
+ "eta": 0.1,
+ "objective": "binary:logistic",
+ "eval_metric": "auc",
+ "tree_method": "hist",
+ "nthread": 16,
+ },
+ )
+ # Add tensorboard receiver to server
+ tb_receiver = TBAnalyticsReceiver(
+ tb_folder="tb_events",
+ )
+ job.to_server(tb_receiver, id="tb_receiver")
+ elif args.training_algo == "histogram_v2":
+ from nvflare.app_opt.xgboost.histogram_based_v2.fed_controller import XGBFedController
+
+ controller = XGBFedController(
+ num_rounds=args.round_num,
+ data_split_mode=0,
+ secure_training=False,
+ xgb_options={"early_stopping_rounds": 2, "use_gpus": False},
+ xgb_params={
+ "max_depth": 8,
+ "eta": 0.1,
+ "objective": "binary:logistic",
+ "eval_metric": "auc",
+ "tree_method": "hist",
+ "nthread": 16,
+ },
+ )
+ from nvflare.app_opt.xgboost.histogram_based_v2.fed_executor import FedXGBHistogramExecutor
+
+ executor = FedXGBHistogramExecutor(
+ data_loader_id="dataloader",
+ metrics_writer_id="metrics_writer",
+ )
+ # Add tensorboard receiver to server
+ tb_receiver = TBAnalyticsReceiver(
+ tb_folder="tb_events",
+ )
+ job.to_server(tb_receiver, id="tb_receiver")
+ elif args.training_algo == "bagging":
+ from nvflare.app_common.workflows.scatter_and_gather import ScatterAndGather
+
+ controller = ScatterAndGather(
+ min_clients=args.site_num,
+ num_rounds=args.round_num,
+ start_round=0,
+ aggregator_id="aggregator",
+ persistor_id="persistor",
+ shareable_generator_id="shareable_generator",
+ wait_time_after_min_received=0,
+ train_timeout=0,
+ allow_empty_global_weights=True,
+ task_check_period=0.01,
+ persist_every_n_rounds=0,
+ snapshot_every_n_rounds=0,
+ )
+ from nvflare.app_opt.xgboost.tree_based.model_persistor import XGBModelPersistor
+
+ persistor = XGBModelPersistor(save_name="xgboost_model.json")
+ from nvflare.app_opt.xgboost.tree_based.shareable_generator import XGBModelShareableGenerator
+
+ shareable_generator = XGBModelShareableGenerator()
+ from nvflare.app_opt.xgboost.tree_based.bagging_aggregator import XGBBaggingAggregator
+
+ aggregator = XGBBaggingAggregator()
+ job.to_server(persistor, id="persistor")
+ job.to_server(shareable_generator, id="shareable_generator")
+ job.to_server(aggregator, id="aggregator")
+ elif args.training_algo == "cyclic":
+ from nvflare.app_common.workflows.cyclic_ctl import CyclicController
+
+ controller = CyclicController(
+ num_rounds=int(args.round_num / args.site_num),
+ task_assignment_timeout=60,
+ persistor_id="persistor",
+ shareable_generator_id="shareable_generator",
+ task_check_period=0.01,
+ persist_every_n_rounds=0,
+ snapshot_every_n_rounds=0,
+ )
+ from nvflare.app_opt.xgboost.tree_based.model_persistor import XGBModelPersistor
+
+ persistor = XGBModelPersistor(save_name="xgboost_model.json", load_as_dict=False)
+ from nvflare.app_opt.xgboost.tree_based.shareable_generator import XGBModelShareableGenerator
+
+ shareable_generator = XGBModelShareableGenerator()
+ job.to_server(persistor, id="persistor")
+ job.to_server(shareable_generator, id="shareable_generator")
+ # send controller to server
+ job.to_server(controller, id="xgb_controller")
+
+ # Add executor and other components to clients
+ for site_id in range(1, site_num + 1):
+ if args.training_algo in ["bagging", "cyclic"]:
+ lr_scale = 1
+ num_client_bagging = 1
+ if args.training_algo == "bagging":
+ num_client_bagging = args.site_num
+ if args.lr_mode == "scaled":
+ data_split = _read_json(f"{dataset_path}/data_site-{site_id}.json")
+ lr_scales = _get_lr_scale_from_split_json(data_split)
+ lr_scale = lr_scales[f"site-{site_id}"]
+ from nvflare.app_opt.xgboost.tree_based.executor import FedXGBTreeExecutor
+
+ executor = FedXGBTreeExecutor(
+ data_loader_id="dataloader",
+ training_mode=args.training_algo,
+ num_client_bagging=num_client_bagging,
+ num_local_parallel_tree=1,
+ local_subsample=1,
+ local_model_path="model.json",
+ global_model_path="model_global.json",
+ learning_rate=0.1,
+ objective="binary:logistic",
+ max_depth=8,
+ eval_metric="auc",
+ tree_method="hist",
+ nthread=16,
+ lr_scale=lr_scale,
+ lr_mode=args.lr_mode,
+ )
+ job.to(executor, f"site-{site_id}")
+
+ dataloader = HIGGSDataLoader(data_split_filename=f"{dataset_path}/data_site-{site_id}.json")
+ job.to(dataloader, f"site-{site_id}", id="dataloader")
+
+ if args.training_algo in ["histogram", "histogram_v2"]:
+ metrics_writer = TBWriter(event_type="analytix_log_stats")
+ job.to(metrics_writer, f"site-{site_id}", id="metrics_writer")
+
+ event_to_fed = ConvertToFedEvent(
+ events_to_convert=["analytix_log_stats"],
+ fed_event_prefix="fed.",
+ )
+ job.to(event_to_fed, f"site-{site_id}", id="event_to_fed")
+
+ # Export job config and run the job
+ job.export_job("/tmp/nvflare/workspace/jobs/")
+ job.simulator_run(f"/tmp/nvflare/workspace/works/{job_name}")
+
+
+if __name__ == "__main__":
+ main()
diff --git a/examples/advanced/xgboost/fedxgb/xgb_fl_job_vertical.py b/examples/advanced/xgboost/fedxgb/xgb_fl_job_vertical.py
new file mode 100644
index 0000000000..717188f681
--- /dev/null
+++ b/examples/advanced/xgboost/fedxgb/xgb_fl_job_vertical.py
@@ -0,0 +1,107 @@
+# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+
+from src.vertical_data_loader import VerticalDataLoader
+
+from nvflare.app_common.widgets.convert_to_fed_event import ConvertToFedEvent
+from nvflare.app_opt.tracking.tb.tb_receiver import TBAnalyticsReceiver
+from nvflare.app_opt.tracking.tb.tb_writer import TBWriter
+from nvflare.app_opt.xgboost.histogram_based_v2.fed_controller import XGBFedController
+from nvflare.app_opt.xgboost.histogram_based_v2.fed_executor import FedXGBHistogramExecutor
+from nvflare.job_config.api import FedJob
+
+
+def define_parser():
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ "--data_split_path",
+ type=str,
+ default="/tmp/nvflare/dataset/xgboost_higgs_vertical/{SITE_NAME}/higgs.data.csv",
+ help="Path to data split files for each site",
+ )
+ parser.add_argument(
+ "--psi_path",
+ type=str,
+ default="/tmp/nvflare/workspace/works/vertical_xgb_psi/{SITE_NAME}/simulate_job/{SITE_NAME}/psi/intersection.txt",
+        help="Path to the PSI intersection file for each site",
+ )
+ parser.add_argument("--site_num", type=int, default=2, help="Total number of sites")
+ parser.add_argument("--round_num", type=int, default=100, help="Total number of training rounds")
+ return parser.parse_args()
+
+
+def main():
+ args = define_parser()
+ data_split_path = args.data_split_path
+ psi_path = args.psi_path
+ site_num = args.site_num
+ round_num = args.round_num
+ job_name = "xgboost_vertical"
+ job = FedJob(name=job_name, min_clients=site_num)
+
+ # Define the controller workflow and send to server
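+    # data_split_mode=1 configures XGBoost for a column-wise (vertical) data split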
+ controller = XGBFedController(
+ num_rounds=round_num,
+ data_split_mode=1,
+ secure_training=False,
+ xgb_options={"early_stopping_rounds": 3, "use_gpus": False},
+ xgb_params={
+ "max_depth": 8,
+ "eta": 0.1,
+ "objective": "binary:logistic",
+ "eval_metric": "auc",
+ "tree_method": "hist",
+ "nthread": 16,
+ },
+ )
+ job.to_server(controller, id="xgb_controller")
+
+ # Add tensorboard receiver to server
+ tb_receiver = TBAnalyticsReceiver(
+ tb_folder="tb_events",
+ )
+ job.to_server(tb_receiver, id="tb_receiver")
+
+ # Define the executor and send to clients
+ executor = FedXGBHistogramExecutor(
+ data_loader_id="dataloader",
+ metrics_writer_id="metrics_writer",
+ in_process=True,
+ model_file_name="test.model.json",
+ )
+ job.to_clients(executor, id="xgb_hist_executor", tasks=["config", "start"])
+
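+    # The vertical data loader reads each site's feature columns and keeps only the rows
+    # whose IDs appear in the PSI intersection; label_owner marks the site holding the label.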
+ dataloader = VerticalDataLoader(
+ data_split_path=data_split_path, psi_path=psi_path, id_col="uid", label_owner="site-1", train_proportion=0.8
+ )
+ job.to_clients(dataloader, id="dataloader")
+
+ metrics_writer = TBWriter(event_type="analytix_log_stats")
+ job.to_clients(metrics_writer, id="metrics_writer")
+
+ event_to_fed = ConvertToFedEvent(
+ events_to_convert=["analytix_log_stats"],
+ fed_event_prefix="fed.",
+ )
+ job.to_clients(event_to_fed, id="event_to_fed")
+
+ # Export job config and run the job
+ job.export_job("/tmp/nvflare/workspace/jobs/")
+ job.simulator_run(f"/tmp/nvflare/workspace/works/{job_name}", n_clients=site_num)
+
+
+if __name__ == "__main__":
+ main()
diff --git a/examples/advanced/xgboost/fedxgb/xgb_fl_job_vertical_psi.py b/examples/advanced/xgboost/fedxgb/xgb_fl_job_vertical_psi.py
new file mode 100644
index 0000000000..fc4954e772
--- /dev/null
+++ b/examples/advanced/xgboost/fedxgb/xgb_fl_job_vertical_psi.py
@@ -0,0 +1,70 @@
+# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+
+from src.local_psi import LocalPSI
+
+from nvflare.app_common.psi.dh_psi.dh_psi_controller import DhPSIController
+from nvflare.app_common.psi.file_psi_writer import FilePSIWriter
+from nvflare.app_common.psi.psi_executor import PSIExecutor
+from nvflare.app_opt.psi.dh_psi.dh_psi_task_handler import DhPSITaskHandler
+from nvflare.job_config.api import FedJob
+
+
+def define_parser():
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ "--data_split_path",
+ type=str,
+ default="/tmp/nvflare/dataset/xgboost_higgs_vertical/site-x/higgs.data.csv",
+ help="Path to data split files for each site",
+ )
+ parser.add_argument("--site_num", type=int, default=2, help="Total number of sites")
+    parser.add_argument("--psi_path", type=str, default="psi/intersection.txt", help="PSI output path")
+ return parser.parse_args()
+
+
+def main():
+ args = define_parser()
+ data_split_path = args.data_split_path
+ psi_path = args.psi_path
+ site_num = args.site_num
+ job_name = "xgboost_vertical_psi"
+ job = FedJob(name=job_name, min_clients=site_num)
+
+ # Define the controller workflow and send to server
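+    # The DhPSIController coordinates the Diffie-Hellman-based PSI protocol across sites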
+ controller = DhPSIController()
+ job.to_server(controller)
+
+ # Define the executor and other components for each site
+ executor = PSIExecutor(psi_algo_id="dh_psi")
+ job.to_clients(executor, id="psi_executor", tasks=["PSI"])
+
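+    # Each site builds its local ID set from the "uid" column of its own data file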
+ local_psi = LocalPSI(psi_writer_id="psi_writer", data_split_path=data_split_path, id_col="uid")
+ job.to_clients(local_psi, id="local_psi")
+
+ task_handler = DhPSITaskHandler(local_psi_id="local_psi")
+ job.to_clients(task_handler, id="dh_psi")
+
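+    # The computed intersection is written to psi/intersection.txt inside each site's job workspace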
+ psi_writer = FilePSIWriter(output_path=psi_path)
+ job.to_clients(psi_writer, id="psi_writer")
+
+ # Export job config and run the job
+ job.export_job("/tmp/nvflare/workspace/jobs/")
+ job.simulator_run(f"/tmp/nvflare/workspace/works/{job_name}", n_clients=site_num)
+
+
+if __name__ == "__main__":
+ main()
diff --git a/examples/advanced/xgboost_secure/.gitignore b/examples/advanced/xgboost/fedxgb_secure/.gitignore
similarity index 100%
rename from examples/advanced/xgboost_secure/.gitignore
rename to examples/advanced/xgboost/fedxgb_secure/.gitignore
diff --git a/examples/advanced/xgboost_secure/README.md b/examples/advanced/xgboost/fedxgb_secure/README.md
similarity index 100%
rename from examples/advanced/xgboost_secure/README.md
rename to examples/advanced/xgboost/fedxgb_secure/README.md
diff --git a/examples/advanced/xgboost_secure/figs/tree.base.png b/examples/advanced/xgboost/fedxgb_secure/figs/tree.base.png
similarity index 100%
rename from examples/advanced/xgboost_secure/figs/tree.base.png
rename to examples/advanced/xgboost/fedxgb_secure/figs/tree.base.png
diff --git a/examples/advanced/xgboost_secure/figs/tree.vert.secure.0.png b/examples/advanced/xgboost/fedxgb_secure/figs/tree.vert.secure.0.png
similarity index 100%
rename from examples/advanced/xgboost_secure/figs/tree.vert.secure.0.png
rename to examples/advanced/xgboost/fedxgb_secure/figs/tree.vert.secure.0.png
diff --git a/examples/advanced/xgboost_secure/figs/tree.vert.secure.1.png b/examples/advanced/xgboost/fedxgb_secure/figs/tree.vert.secure.1.png
similarity index 100%
rename from examples/advanced/xgboost_secure/figs/tree.vert.secure.1.png
rename to examples/advanced/xgboost/fedxgb_secure/figs/tree.vert.secure.1.png
diff --git a/examples/advanced/xgboost_secure/figs/tree.vert.secure.2.png b/examples/advanced/xgboost/fedxgb_secure/figs/tree.vert.secure.2.png
similarity index 100%
rename from examples/advanced/xgboost_secure/figs/tree.vert.secure.2.png
rename to examples/advanced/xgboost/fedxgb_secure/figs/tree.vert.secure.2.png
diff --git a/examples/advanced/xgboost_secure/prepare_data.sh b/examples/advanced/xgboost/fedxgb_secure/prepare_data.sh
similarity index 100%
rename from examples/advanced/xgboost_secure/prepare_data.sh
rename to examples/advanced/xgboost/fedxgb_secure/prepare_data.sh
diff --git a/examples/advanced/xgboost_secure/prepare_flare_job.sh b/examples/advanced/xgboost/fedxgb_secure/prepare_flare_job.sh
similarity index 100%
rename from examples/advanced/xgboost_secure/prepare_flare_job.sh
rename to examples/advanced/xgboost/fedxgb_secure/prepare_flare_job.sh
diff --git a/examples/advanced/xgboost_secure/project.yml b/examples/advanced/xgboost/fedxgb_secure/project.yml
similarity index 100%
rename from examples/advanced/xgboost_secure/project.yml
rename to examples/advanced/xgboost/fedxgb_secure/project.yml
diff --git a/examples/advanced/xgboost_secure/run_training_flare.sh b/examples/advanced/xgboost/fedxgb_secure/run_training_flare.sh
similarity index 100%
rename from examples/advanced/xgboost_secure/run_training_flare.sh
rename to examples/advanced/xgboost/fedxgb_secure/run_training_flare.sh
diff --git a/examples/advanced/xgboost_secure/run_training_standalone.sh b/examples/advanced/xgboost/fedxgb_secure/run_training_standalone.sh
similarity index 100%
rename from examples/advanced/xgboost_secure/run_training_standalone.sh
rename to examples/advanced/xgboost/fedxgb_secure/run_training_standalone.sh
diff --git a/examples/advanced/xgboost_secure/train_standalone/train_base.py b/examples/advanced/xgboost/fedxgb_secure/train_standalone/train_base.py
similarity index 98%
rename from examples/advanced/xgboost_secure/train_standalone/train_base.py
rename to examples/advanced/xgboost/fedxgb_secure/train_standalone/train_base.py
index 58db56b94c..a8762b9e29 100644
--- a/examples/advanced/xgboost_secure/train_standalone/train_base.py
+++ b/examples/advanced/xgboost/fedxgb_secure/train_standalone/train_base.py
@@ -38,7 +38,7 @@ def train_base_args_parser():
parser.add_argument(
"--out_path",
type=str,
- default="/tmp/nvflare/xgboost_secure/train_standalone/base",
+ default="/tmp/nvflare/fedxgb_secure/train_standalone/base",
help="Output path for the data split file",
)
return parser
diff --git a/examples/advanced/xgboost_secure/train_standalone/train_federated.py b/examples/advanced/xgboost/fedxgb_secure/train_standalone/train_federated.py
similarity index 98%
rename from examples/advanced/xgboost_secure/train_standalone/train_federated.py
rename to examples/advanced/xgboost/fedxgb_secure/train_standalone/train_federated.py
index 808e88fa17..f4aad83054 100644
--- a/examples/advanced/xgboost_secure/train_standalone/train_federated.py
+++ b/examples/advanced/xgboost/fedxgb_secure/train_standalone/train_federated.py
@@ -48,7 +48,7 @@ def train_federated_args_parser():
parser.add_argument(
"--out_path",
type=str,
- default="/tmp/nvflare/xgboost_secure/train_standalone/federated",
+ default="/tmp/nvflare/fedxgb_secure/train_standalone/federated",
help="Output path for the data split file",
)
return parser
diff --git a/examples/advanced/xgboost_secure/utils/prepare_data_base.py b/examples/advanced/xgboost/fedxgb_secure/utils/prepare_data_base.py
similarity index 100%
rename from examples/advanced/xgboost_secure/utils/prepare_data_base.py
rename to examples/advanced/xgboost/fedxgb_secure/utils/prepare_data_base.py
diff --git a/examples/advanced/xgboost_secure/utils/prepare_data_horizontal.py b/examples/advanced/xgboost/fedxgb_secure/utils/prepare_data_horizontal.py
similarity index 100%
rename from examples/advanced/xgboost_secure/utils/prepare_data_horizontal.py
rename to examples/advanced/xgboost/fedxgb_secure/utils/prepare_data_horizontal.py
diff --git a/examples/advanced/xgboost_secure/utils/prepare_data_traintest_split.py b/examples/advanced/xgboost/fedxgb_secure/utils/prepare_data_traintest_split.py
similarity index 100%
rename from examples/advanced/xgboost_secure/utils/prepare_data_traintest_split.py
rename to examples/advanced/xgboost/fedxgb_secure/utils/prepare_data_traintest_split.py
diff --git a/examples/advanced/xgboost_secure/utils/prepare_data_vertical.py b/examples/advanced/xgboost/fedxgb_secure/utils/prepare_data_vertical.py
similarity index 100%
rename from examples/advanced/xgboost_secure/utils/prepare_data_vertical.py
rename to examples/advanced/xgboost/fedxgb_secure/utils/prepare_data_vertical.py
diff --git a/examples/advanced/xgboost/histogram-based/README.md b/examples/advanced/xgboost/histogram-based/README.md
deleted file mode 100644
index 8c89f95eff..0000000000
--- a/examples/advanced/xgboost/histogram-based/README.md
+++ /dev/null
@@ -1,77 +0,0 @@
-# Histogram-based Federated Learning for XGBoost
-
-## Run automated experiments
-Please make sure to finish the [preparation steps](../README.md) before running the following steps.
-To run this example with NVFlare, follow the steps below or this [notebook](./xgboost_histogram_higgs.ipynb) for an interactive experience.
-
-### Environment Preparation
-
-Switch to this directory and install additional requirements (suggest to do this inside virtual environment):
-```
-python3 -m pip install -r requirements.txt
-```
-
-### Run centralized experiments
-```
-bash run_experiment_centralized.sh
-```
-
-### Run federated experiments with simulator locally
-Next, we will use the NVFlare simulator to run FL training automatically.
-```
-nvflare simulator jobs/higgs_2_histogram_v2_uniform_split_uniform_lr \
- -w /tmp/nvflare/xgboost_v2_workspace -n 2 -t 2
-```
-
-Model accuracy can be visualized in tensorboard:
-```
-tensorboard --logdir /tmp/nvflare/xgboost_v2_workspace/simulate_job/tb_events
-```
-
-### Run federated experiments in real world
-
-To run in a federated setting, follow [Real-World FL](https://nvflare.readthedocs.io/en/main/real_world_fl.html) to
-start the overseer, FL servers and FL clients.
-
-You need to download the HIGGS data on each client site.
-You will also need to install XGBoost on each client site and server site.
-
-You can still generate the data splits and job configs using the scripts provided.
-
-You will need to copy the generated data split file into each client site.
-You might also need to modify the `data_path` in the `data_site-XXX.json`
-inside the `/tmp/nvflare/xgboost_higgs_dataset` folder,
-since each site might save the HIGGS dataset in different places.
-
-Then, you can use the admin client to submit the job via the `submit_job` command.
-
-## Customization
-
-The provided XGBoost executor can be customized using boost parameters
-provided in the `xgb_params` argument.
-
-If the parameter change alone is not sufficient and code changes are required,
-a custom executor can be implemented to make calls to xgboost library directly.
-
-The custom executor can inherit the base class `FedXGBHistogramExecutor` and
-overwrite the `xgb_train()` method.
-
-To use a different dataset, you can inherit the base class `XGBDataLoader` and
-implement the `load_data()` method.
-
-## Loose integration
-
-We can use the NVFlare controller/executor just to launch the external xgboost
-federated server and client.
-
-### Run federated experiments with simulator locally
-Next, we will use the NVFlare simulator to run FL training automatically.
-```
-nvflare simulator jobs/higgs_2_histogram_uniform_split_uniform_lr \
- -w /tmp/nvflare/xgboost_workspace -n 2 -t 2
-```
-
-Model accuracy can be visualized in tensorboard:
-```
-tensorboard --logdir /tmp/nvflare/xgboost_workspace/simulate_job/tb_events
-```
diff --git a/examples/advanced/xgboost/histogram-based/jobs/base/app/config/config_fed_client.json b/examples/advanced/xgboost/histogram-based/jobs/base/app/config/config_fed_client.json
deleted file mode 100755
index a3fe316d90..0000000000
--- a/examples/advanced/xgboost/histogram-based/jobs/base/app/config/config_fed_client.json
+++ /dev/null
@@ -1,50 +0,0 @@
-{
- "format_version": 2,
- "num_rounds": 100,
- "executors": [
- {
- "tasks": [
- "train"
- ],
- "executor": {
- "id": "Executor",
- "path": "nvflare.app_opt.xgboost.histogram_based.executor.FedXGBHistogramExecutor",
- "args": {
- "data_loader_id": "dataloader",
- "num_rounds": "{num_rounds}",
- "early_stopping_rounds": 2,
- "metrics_writer_id": "metrics_writer",
- "xgb_params": {
- "max_depth": 8,
- "eta": 0.1,
- "objective": "binary:logistic",
- "eval_metric": "auc",
- "tree_method": "hist",
- "nthread": 16
- }
- }
- }
- }
- ],
- "task_result_filters": [],
- "task_data_filters": [],
- "components": [
- {
- "id": "dataloader",
- "path": "higgs_data_loader.HIGGSDataLoader",
- "args": {
- "data_split_filename": "data_split.json"
- }
- },
- {
- "id": "metrics_writer",
- "path": "nvflare.app_opt.tracking.tb.tb_writer.TBWriter",
- "args": {"event_type": "analytix_log_stats"}
- },
- {
- "id": "event_to_fed",
- "path": "nvflare.app_common.widgets.convert_to_fed_event.ConvertToFedEvent",
- "args": {"events_to_convert": ["analytix_log_stats"], "fed_event_prefix": "fed."}
- }
- ]
-}
diff --git a/examples/advanced/xgboost/histogram-based/jobs/base/app/config/config_fed_server.json b/examples/advanced/xgboost/histogram-based/jobs/base/app/config/config_fed_server.json
deleted file mode 100755
index 9814f32e2c..0000000000
--- a/examples/advanced/xgboost/histogram-based/jobs/base/app/config/config_fed_server.json
+++ /dev/null
@@ -1,23 +0,0 @@
-{
- "format_version": 2,
- "task_data_filters": [],
- "task_result_filters": [],
- "components": [
- {
- "id": "tb_receiver",
- "path": "nvflare.app_opt.tracking.tb.tb_receiver.TBAnalyticsReceiver",
- "args": {
- "tb_folder": "tb_events"
- }
- }
- ],
- "workflows": [
- {
- "id": "xgb_controller",
- "path": "nvflare.app_opt.xgboost.histogram_based.controller.XGBFedController",
- "args": {
- "train_timeout": 30000
- }
- }
- ]
-}
\ No newline at end of file
diff --git a/examples/advanced/xgboost/histogram-based/jobs/base/meta.json b/examples/advanced/xgboost/histogram-based/jobs/base/meta.json
deleted file mode 100644
index 68fc7c42e0..0000000000
--- a/examples/advanced/xgboost/histogram-based/jobs/base/meta.json
+++ /dev/null
@@ -1,10 +0,0 @@
-{
- "name": "xgboost_histogram_based",
- "resource_spec": {},
- "deploy_map": {
- "app": [
- "@ALL"
- ]
- },
- "min_clients": 2
-}
diff --git a/examples/advanced/xgboost/histogram-based/jobs/base_v2/app/config/config_fed_client.json b/examples/advanced/xgboost/histogram-based/jobs/base_v2/app/config/config_fed_client.json
deleted file mode 100755
index a23a960c3d..0000000000
--- a/examples/advanced/xgboost/histogram-based/jobs/base_v2/app/config/config_fed_client.json
+++ /dev/null
@@ -1,39 +0,0 @@
-{
- "format_version": 2,
- "executors": [
- {
- "tasks": [
- "config", "start"
- ],
- "executor": {
- "id": "Executor",
- "path": "nvflare.app_opt.xgboost.histogram_based_v2.fed_executor.FedXGBHistogramExecutor",
- "args": {
- "data_loader_id": "dataloader",
- "metrics_writer_id": "metrics_writer"
- }
- }
- }
- ],
- "task_result_filters": [],
- "task_data_filters": [],
- "components": [
- {
- "id": "dataloader",
- "path": "higgs_data_loader.HIGGSDataLoader",
- "args": {
- "data_split_filename": "data_split.json"
- }
- },
- {
- "id": "metrics_writer",
- "path": "nvflare.app_opt.tracking.tb.tb_writer.TBWriter",
- "args": {"event_type": "analytix_log_stats"}
- },
- {
- "id": "event_to_fed",
- "path": "nvflare.app_common.widgets.convert_to_fed_event.ConvertToFedEvent",
- "args": {"events_to_convert": ["analytix_log_stats"], "fed_event_prefix": "fed."}
- }
- ]
-}
diff --git a/examples/advanced/xgboost/histogram-based/jobs/base_v2/app/config/config_fed_server.json b/examples/advanced/xgboost/histogram-based/jobs/base_v2/app/config/config_fed_server.json
deleted file mode 100755
index d0dd1e3908..0000000000
--- a/examples/advanced/xgboost/histogram-based/jobs/base_v2/app/config/config_fed_server.json
+++ /dev/null
@@ -1,37 +0,0 @@
-{
- "format_version": 2,
- "num_rounds": 100,
- "task_data_filters": [],
- "task_result_filters": [],
- "components": [
- {
- "id": "tb_receiver",
- "path": "nvflare.app_opt.tracking.tb.tb_receiver.TBAnalyticsReceiver",
- "args": {
- "tb_folder": "tb_events"
- }
- }
- ],
- "workflows": [
- {
- "id": "xgb_controller",
- "path": "nvflare.app_opt.xgboost.histogram_based_v2.fed_controller.XGBFedController",
- "args": {
- "num_rounds": "{num_rounds}",
- "data_split_mode": 0,
- "secure_training": false,
- "xgb_params": {
- "max_depth": 8,
- "eta": 0.1,
- "objective": "binary:logistic",
- "eval_metric": "auc",
- "tree_method": "hist",
- "nthread": 16
- },
- "xgb_options": {
- "early_stopping_rounds": 2
- }
- }
- }
- ]
-}
\ No newline at end of file
diff --git a/examples/advanced/xgboost/histogram-based/jobs/base_v2/app/custom/higgs_data_loader.py b/examples/advanced/xgboost/histogram-based/jobs/base_v2/app/custom/higgs_data_loader.py
deleted file mode 100644
index 6623e35fa3..0000000000
--- a/examples/advanced/xgboost/histogram-based/jobs/base_v2/app/custom/higgs_data_loader.py
+++ /dev/null
@@ -1,77 +0,0 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import json
-
-import pandas as pd
-import xgboost as xgb
-
-from nvflare.app_opt.xgboost.data_loader import XGBDataLoader
-
-
-def _read_higgs_with_pandas(data_path, start: int, end: int):
- data_size = end - start
- data = pd.read_csv(data_path, header=None, skiprows=start, nrows=data_size)
- data_num = data.shape[0]
-
- # split to feature and label
- x = data.iloc[:, 1:].copy()
- y = data.iloc[:, 0].copy()
-
- return x, y, data_num
-
-
-class HIGGSDataLoader(XGBDataLoader):
- def __init__(self, data_split_filename):
- """Reads HIGGS dataset and return XGB data matrix.
-
- Args:
- data_split_filename: file name to data splits
- """
- self.data_split_filename = data_split_filename
-
- def load_data(self):
- with open(self.data_split_filename, "r") as file:
- data_split = json.load(file)
-
- data_path = data_split["data_path"]
- data_index = data_split["data_index"]
-
- # check if site_id and "valid" in the mapping dict
- if self.client_id not in data_index.keys():
- raise ValueError(
- f"Data does not contain Client {self.client_id} split",
- )
-
- if "valid" not in data_index.keys():
- raise ValueError(
- "Data does not contain Validation split",
- )
-
- site_index = data_index[self.client_id]
- valid_index = data_index["valid"]
-
- # training
- x_train, y_train, total_train_data_num = _read_higgs_with_pandas(
- data_path=data_path, start=site_index["start"], end=site_index["end"]
- )
- dmat_train = xgb.DMatrix(x_train, label=y_train)
-
- # validation
- x_valid, y_valid, total_valid_data_num = _read_higgs_with_pandas(
- data_path=data_path, start=valid_index["start"], end=valid_index["end"]
- )
- dmat_valid = xgb.DMatrix(x_valid, label=y_valid, data_split_mode=self.data_split_mode)
-
- return dmat_train, dmat_valid
diff --git a/examples/advanced/xgboost/histogram-based/jobs/base_v2/meta.json b/examples/advanced/xgboost/histogram-based/jobs/base_v2/meta.json
deleted file mode 100644
index 6d82211a16..0000000000
--- a/examples/advanced/xgboost/histogram-based/jobs/base_v2/meta.json
+++ /dev/null
@@ -1,10 +0,0 @@
-{
- "name": "xgboost_histogram_based_v2",
- "resource_spec": {},
- "deploy_map": {
- "app": [
- "@ALL"
- ]
- },
- "min_clients": 2
-}
diff --git a/examples/advanced/xgboost/histogram-based/prepare_data.sh b/examples/advanced/xgboost/histogram-based/prepare_data.sh
deleted file mode 100755
index f7bdf9e68d..0000000000
--- a/examples/advanced/xgboost/histogram-based/prepare_data.sh
+++ /dev/null
@@ -1,5 +0,0 @@
-#!/usr/bin/env bash
-
-SCRIPT_DIR="$( dirname -- "$0"; )";
-
-bash "${SCRIPT_DIR}"/../prepare_data.sh
diff --git a/examples/advanced/xgboost/histogram-based/requirements.txt b/examples/advanced/xgboost/histogram-based/requirements.txt
deleted file mode 100644
index d79a5bef89..0000000000
--- a/examples/advanced/xgboost/histogram-based/requirements.txt
+++ /dev/null
@@ -1,9 +0,0 @@
-nvflare~=2.5.0rc
-pandas
-scikit-learn
-torch
-tensorboard
-matplotlib
-shap
-# require xgboost 2.2 version, for now need to install a nightly build
-https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/federated-secure/xgboost-2.2.0.dev0%2B4601688195708f7c31fcceeb0e0ac735e7311e61-py3-none-manylinux_2_28_x86_64.whl
diff --git a/examples/advanced/xgboost/histogram-based/run_experiment_centralized.sh b/examples/advanced/xgboost/histogram-based/run_experiment_centralized.sh
deleted file mode 100755
index 7a71f2d0a8..0000000000
--- a/examples/advanced/xgboost/histogram-based/run_experiment_centralized.sh
+++ /dev/null
@@ -1,9 +0,0 @@
-#!/usr/bin/env bash
-DATASET_PATH="$HOME/dataset/HIGGS.csv"
-
-if [ ! -f "${DATASET_PATH}" ]
-then
- echo "Please check if you saved HIGGS dataset in ${DATASET_PATH}"
- exit 1
-fi
-python3 ../utils/baseline_centralized.py --num_parallel_tree 1 --train_in_one_session --data_path "${DATASET_PATH}"
diff --git a/examples/advanced/xgboost/histogram-based/run_experiment_simulator.sh b/examples/advanced/xgboost/histogram-based/run_experiment_simulator.sh
deleted file mode 100755
index eb6861c326..0000000000
--- a/examples/advanced/xgboost/histogram-based/run_experiment_simulator.sh
+++ /dev/null
@@ -1,9 +0,0 @@
-#!/usr/bin/env bash
-
-n=2
-study=histogram_uniform_split_uniform_lr
-nvflare simulator jobs/higgs_${n}_${study} -w ${PWD}/workspaces/xgboost_workspace_${n}_${study} -n ${n} -t ${n}
-
-n=5
-study=histogram_uniform_split_uniform_lr
-nvflare simulator jobs/higgs_${n}_${study} -w ${PWD}/workspaces/xgboost_workspace_${n}_${study} -n ${n} -t ${n}
diff --git a/examples/advanced/xgboost/prepare_data.sh b/examples/advanced/xgboost/prepare_data.sh
deleted file mode 100755
index f1a2e28675..0000000000
--- a/examples/advanced/xgboost/prepare_data.sh
+++ /dev/null
@@ -1,25 +0,0 @@
-#!/usr/bin/env bash
-DATASET_PATH="$HOME/dataset/HIGGS.csv"
-OUTPUT_PATH="/tmp/nvflare/xgboost_higgs_dataset"
-SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
-
-if [ ! -f "${DATASET_PATH}" ]
-then
- echo "Please check if you saved HIGGS dataset in ${DATASET_PATH}"
-fi
-
-echo "Generated HIGGS data splits, reading from ${DATASET_PATH}"
-for site_num in 2 5 20;
-do
- for split_mode in uniform exponential square;
- do
- python3 ${SCRIPT_DIR}/utils/prepare_data_split.py \
- --data_path "${DATASET_PATH}" \
- --site_num ${site_num} \
- --size_total 11000000 \
- --size_valid 1000000 \
- --split_method ${split_mode} \
- --out_path "${OUTPUT_PATH}/${site_num}_${split_mode}"
- done
-done
-echo "Data splits are generated in ${OUTPUT_PATH}"
diff --git a/examples/advanced/xgboost/prepare_job_config.sh b/examples/advanced/xgboost/prepare_job_config.sh
deleted file mode 100755
index f839b46242..0000000000
--- a/examples/advanced/xgboost/prepare_job_config.sh
+++ /dev/null
@@ -1,26 +0,0 @@
-#!/usr/bin/env bash
-TREE_METHOD="hist"
-
-prepare_job_config() {
- python3 utils/prepare_job_config.py --site_num "$1" --training_algo "$2" --split_method "$3" \
- --lr_mode "$4" --nthread 16 --tree_method "$5"
-}
-
-echo "Generating job configs"
-prepare_job_config 5 bagging exponential scaled $TREE_METHOD
-prepare_job_config 5 bagging exponential uniform $TREE_METHOD
-prepare_job_config 5 bagging uniform uniform $TREE_METHOD
-prepare_job_config 5 cyclic exponential uniform $TREE_METHOD
-prepare_job_config 5 cyclic uniform uniform $TREE_METHOD
-
-prepare_job_config 20 bagging square scaled $TREE_METHOD
-prepare_job_config 20 bagging square uniform $TREE_METHOD
-prepare_job_config 20 bagging uniform uniform $TREE_METHOD
-prepare_job_config 20 cyclic square uniform $TREE_METHOD
-prepare_job_config 20 cyclic uniform uniform $TREE_METHOD
-
-prepare_job_config 2 histogram uniform uniform $TREE_METHOD
-prepare_job_config 5 histogram uniform uniform $TREE_METHOD
-prepare_job_config 2 histogram_v2 uniform uniform $TREE_METHOD
-prepare_job_config 5 histogram_v2 uniform uniform $TREE_METHOD
-echo "Job configs generated"
diff --git a/examples/advanced/xgboost_secure/requirements.txt b/examples/advanced/xgboost/requirements.txt
similarity index 90%
rename from examples/advanced/xgboost_secure/requirements.txt
rename to examples/advanced/xgboost/requirements.txt
index 2d9890c2c6..95cbefd2e9 100644
--- a/examples/advanced/xgboost_secure/requirements.txt
+++ b/examples/advanced/xgboost/requirements.txt
@@ -1,10 +1,12 @@
-nvflare~=2.5.0rc
-ipcl_python @ git+https://github.com/intel/pailliercryptolib_python.git@development
-# require xgboost 2.2 version, for now need to install a nightly build
-https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/federated-secure/xgboost-2.2.0.dev0%2B4601688195708f7c31fcceeb0e0ac735e7311e61-py3-none-manylinux_2_28_x86_64.whl
+nvflare~=2.5.0
+openmined.psi==1.1.1
pandas
+torch
scikit-learn
shap
matplotlib
tensorboard
tenseal
+# require xgboost 2.2 version, for now need to install a nightly build
+https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/federated-secure/xgboost-2.2.0.dev0%2B4601688195708f7c31fcceeb0e0ac735e7311e61-py3-none-manylinux_2_28_x86_64.whl
+ipcl_python @ git+https://github.com/intel/pailliercryptolib_python.git@development
diff --git a/examples/advanced/xgboost/tree-based/README.md b/examples/advanced/xgboost/tree-based/README.md
deleted file mode 100644
index ddcb545d09..0000000000
--- a/examples/advanced/xgboost/tree-based/README.md
+++ /dev/null
@@ -1,101 +0,0 @@
-# Tree-based Federated Learning for XGBoost
-
-You can also follow along in this [notebook](./xgboost_tree_higgs.ipynb) for an interactive experience.
-
-## Cyclic Training
-
-"Cyclic XGBoost" is one way of performing tree-based federated boosting with multiple sites: at each round of tree boosting, instead of relying on the whole data statistics collected from all clients, the boosting relies on only 1 client's local data. The resulting tree sequence is then forwarded to the next client for next round's boosting. Such training scheme have been proposed in literatures [1] [2].
-
-## Bagging Aggregation
-
-"Bagging XGBoost" is another way of performing tree-based federated boosting with multiple sites: at each round of tree boosting, all sites start from the same "global model", and boost a number of trees (in current example, 1 tree) based on their local data. The resulting trees are then send to server. A bagging aggregation scheme is applied to all the submitted trees to update the global model, which is further distributed to all clients for next round's boosting.
-
-This scheme bears certain similarity to the [Random Forest mode](https://xgboost.readthedocs.io/en/stable/tutorials/rf.html) of XGBoost, where a `num_parallel_tree` is boosted based on random row/col splits, rather than a single tree. Under federated learning setting, such split is fixed to clients rather than random and without column subsampling.
-
-In addition to basic uniform shrinkage setting where all clients have the same learning rate, based on our research, we enabled scaled shrinkage across clients for weighted aggregation according to each client's data size, which is shown to significantly improve the model's performance on non-uniform quantity splits over HIGGS data.
-
-## Run automated experiments
-Please make sure to finish the [preparation steps](../README.md) before running the following steps.
-To run all experiments in this example with NVFlare, follow the steps below. To try out a single experiment, follow this [notebook](./xgboost_tree_higgs.ipynb).
-
-### Environment Preparation
-
-Switch to this directory and install additional requirements (suggest to do this inside virtual environment):
-```
-python3 -m pip install -r requirements.txt
-```
-
-### Run federated experiments with simulator locally
-Next, we will use the NVFlare simulator to run FL training for all the different experiment configurations.
-```
-bash run_experiment_simulator.sh
-```
-
-### Run centralized experiments
-For comparison, we train baseline models in a centralized manner with same round of training.
-```
-bash run_experiment_centralized.sh
-```
-This will train several models w/ and w/o random forest settings. The results are shown below.
-
-
-
-As shown, random forest may not yield significant performance gain,
-and can even make the accuracy worse if subsample rate is too low (e.g. 0.05).
-
-### Results comparison on 5-client and 20-client under various training settings
-
-Let's then summarize the result of the federated learning experiments run above. We compare the AUC scores of
-the model on a standalone validation set consisted of the first 1 million instances of HIGGS dataset.
-
-We provide a script for plotting the tensorboard records, running
-```
-python3 ./utils/plot_tensorboard_events.py
-```
-
-> **_NOTE:_** You need to install [./plot-requirements.txt](./plot-requirements.txt) to plot.
-
-
-The resulting validation AUC curves (no smoothing) are shown below:
-
-
-
-
-As illustrated, we can have the following observations:
-- cyclic training performs ok under uniform split (the purple curve), however under non-uniform split, it will have significant performance drop (the brown curve)
-- bagging training performs better than cyclic under both uniform and non-uniform data splits (orange v.s. purple, red/green v.s. brown)
-- with uniform shrinkage, bagging will have significant performance drop under non-uniform split (green v.s. orange)
-- data-size dependent shrinkage will be able to recover the performance drop above (red v.s. green), and achieve comparable/better performance as uniform data split (red v.s. orange)
-- bagging under uniform data split (orange), and bagging with data-size dependent shrinkage under non-uniform data split(red), can achieve comparable/better performance as compared with centralized training baseline (blue)
-
-For model size, centralized training and cyclic training will have a model consisting of `num_round` trees,
-while the bagging models consist of `num_round * num_client` trees, since each round,
-bagging training boosts a forest consisting of individually trained trees from each client.
-
-### Run federated experiments in real world
-
-To run in a federated setting, follow [Real-World FL](https://nvflare.readthedocs.io/en/main/real_world_fl.html) to
-start the overseer, FL servers and FL clients.
-
-You need to download the HIGGS data on each client site.
-You will also need to install the xgboost on each client site and server site.
-
-You can still generate the data splits and job configs using the scripts provided.
-
-You will need to copy the generated data split file into each client site.
-You might also need to modify the `data_path` in the `data_site-XXX.json`
-inside the `/tmp/nvflare/xgboost_higgs_dataset` folder,
-since each site might save the HIGGS dataset in different places.
-
-Then you can use admin client to submit the job via `submit_job` command.
-
-## Customization
-
-To use other dataset, can inherit the base class `XGBDataLoader` and
-implement that `load_data()` method.
-
-
-## Reference
-[1] Zhao, L. et al., "InPrivate Digging: Enabling Tree-based Distributed Data Mining with Differential Privacy," IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, 2018, pp. 2087-2095
-
-[2] Yamamoto, F. et al., "New Approaches to Federated XGBoost Learning for Privacy-Preserving Data Analysis," ICONIP 2020 - International Conference on Neural Information Processing, 2020, Lecture Notes in Computer Science, vol 12533
diff --git a/examples/advanced/xgboost/tree-based/jobs/bagging_base/app/config/config_fed_client.json b/examples/advanced/xgboost/tree-based/jobs/bagging_base/app/config/config_fed_client.json
deleted file mode 100755
index ef0f19875b..0000000000
--- a/examples/advanced/xgboost/tree-based/jobs/bagging_base/app/config/config_fed_client.json
+++ /dev/null
@@ -1,41 +0,0 @@
-{
- "format_version": 2,
-
- "executors": [
- {
- "tasks": [
- "train"
- ],
- "executor": {
- "id": "Executor",
- "path": "nvflare.app_opt.xgboost.tree_based.executor.FedXGBTreeExecutor",
- "args": {
- "data_loader_id": "dataloader",
- "training_mode": "bagging",
- "num_client_bagging": 5,
- "num_local_parallel_tree": 1,
- "local_subsample": 1,
- "local_model_path": "model.json",
- "global_model_path": "model_global.json",
- "learning_rate": 0.1,
- "objective": "binary:logistic",
- "max_depth": 8,
- "eval_metric": "auc",
- "tree_method": "hist",
- "nthread": 16
- }
- }
- }
- ],
- "task_result_filters": [],
- "task_data_filters": [],
- "components": [
- {
- "id": "dataloader",
- "path": "higgs_data_loader.HIGGSDataLoader",
- "args": {
- "data_split_filename": "data_split.json"
- }
- }
- ]
-}
diff --git a/examples/advanced/xgboost/tree-based/jobs/bagging_base/app/config/config_fed_server.json b/examples/advanced/xgboost/tree-based/jobs/bagging_base/app/config/config_fed_server.json
deleted file mode 100755
index cfd7b83b54..0000000000
--- a/examples/advanced/xgboost/tree-based/jobs/bagging_base/app/config/config_fed_server.json
+++ /dev/null
@@ -1,48 +0,0 @@
-{
- "format_version": 2,
- "num_rounds": 101,
-
- "task_data_filters": [],
- "task_result_filters": [],
-
- "components": [
- {
- "id": "persistor",
- "path": "nvflare.app_opt.xgboost.tree_based.model_persistor.XGBModelPersistor",
- "args": {
- "save_name": "xgboost_model.json"
- }
- },
- {
- "id": "shareable_generator",
- "path": "nvflare.app_opt.xgboost.tree_based.shareable_generator.XGBModelShareableGenerator",
- "args": {}
- },
- {
- "id": "aggregator",
- "path": "nvflare.app_opt.xgboost.tree_based.bagging_aggregator.XGBBaggingAggregator",
- "args": {}
- }
- ],
- "workflows": [
- {
- "id": "scatter_and_gather",
- "path": "nvflare.app_common.workflows.scatter_and_gather.ScatterAndGather",
- "args": {
- "min_clients": 5,
- "num_rounds": "{num_rounds}",
- "start_round": 0,
- "wait_time_after_min_received": 0,
- "aggregator_id": "aggregator",
- "persistor_id": "persistor",
- "shareable_generator_id": "shareable_generator",
- "train_task_name": "train",
- "train_timeout": 0,
- "allow_empty_global_weights": true,
- "task_check_period": 0.01,
- "persist_every_n_rounds": 0,
- "snapshot_every_n_rounds": 0
- }
- }
- ]
-}
diff --git a/examples/advanced/xgboost/tree-based/jobs/bagging_base/app/custom/higgs_data_loader.py b/examples/advanced/xgboost/tree-based/jobs/bagging_base/app/custom/higgs_data_loader.py
deleted file mode 100644
index 124268cfce..0000000000
--- a/examples/advanced/xgboost/tree-based/jobs/bagging_base/app/custom/higgs_data_loader.py
+++ /dev/null
@@ -1,77 +0,0 @@
-# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import json
-
-import pandas as pd
-import xgboost as xgb
-
-from nvflare.app_opt.xgboost.data_loader import XGBDataLoader
-
-
-def _read_higgs_with_pandas(data_path, start: int, end: int):
- data_size = end - start
- data = pd.read_csv(data_path, header=None, skiprows=start, nrows=data_size)
- data_num = data.shape[0]
-
- # split to feature and label
- x = data.iloc[:, 1:].copy()
- y = data.iloc[:, 0].copy()
-
- return x, y, data_num
-
-
-class HIGGSDataLoader(XGBDataLoader):
- def __init__(self, data_split_filename):
- """Reads HIGGS dataset and return XGB data matrix.
-
- Args:
- data_split_filename: file name to data splits
- """
- self.data_split_filename = data_split_filename
-
- def load_data(self):
- with open(self.data_split_filename, "r") as file:
- data_split = json.load(file)
-
- data_path = data_split["data_path"]
- data_index = data_split["data_index"]
-
- # check if site_id and "valid" in the mapping dict
- if self.client_id not in data_index.keys():
- raise ValueError(
- f"Data does not contain Client {self.client_id} split",
- )
-
- if "valid" not in data_index.keys():
- raise ValueError(
- "Data does not contain Validation split",
- )
-
- site_index = data_index[self.client_id]
- valid_index = data_index["valid"]
-
- # training
- x_train, y_train, total_train_data_num = _read_higgs_with_pandas(
- data_path=data_path, start=site_index["start"], end=site_index["end"]
- )
- dmat_train = xgb.DMatrix(x_train, label=y_train)
-
- # validation
- x_valid, y_valid, total_valid_data_num = _read_higgs_with_pandas(
- data_path=data_path, start=valid_index["start"], end=valid_index["end"]
- )
- dmat_valid = xgb.DMatrix(x_valid, label=y_valid, data_split_mode=self.data_split_mode)
-
- return dmat_train, dmat_valid
diff --git a/examples/advanced/xgboost/tree-based/jobs/bagging_base/meta.json b/examples/advanced/xgboost/tree-based/jobs/bagging_base/meta.json
deleted file mode 100644
index aa7ac49fd6..0000000000
--- a/examples/advanced/xgboost/tree-based/jobs/bagging_base/meta.json
+++ /dev/null
@@ -1,9 +0,0 @@
-{
- "name": "xgboost_tree_bagging",
- "resource_spec": {},
- "deploy_map": {
- "app": [
- "@ALL"
- ]
- }
-}
diff --git a/examples/advanced/xgboost/tree-based/jobs/cyclic_base/app/config/config_fed_client.json b/examples/advanced/xgboost/tree-based/jobs/cyclic_base/app/config/config_fed_client.json
deleted file mode 100755
index d63a3ea551..0000000000
--- a/examples/advanced/xgboost/tree-based/jobs/cyclic_base/app/config/config_fed_client.json
+++ /dev/null
@@ -1,39 +0,0 @@
-{
- "format_version": 2,
-
- "executors": [
- {
- "tasks": [
- "train"
- ],
- "executor": {
- "id": "Executor",
- "path": "nvflare.app_opt.xgboost.tree_based.executor.FedXGBTreeExecutor",
- "args": {
- "data_loader_id": "dataloader",
- "training_mode": "cyclic",
- "num_client_bagging": 1,
- "local_model_path": "model.json",
- "global_model_path": "model_global.json",
- "learning_rate": 0.1,
- "objective": "binary:logistic",
- "max_depth": 8,
- "eval_metric": "auc",
- "tree_method": "hist",
- "nthread": 16
- }
- }
- }
- ],
- "task_result_filters": [],
- "task_data_filters": [],
- "components": [
- {
- "id": "dataloader",
- "path": "higgs_data_loader.HIGGSDataLoader",
- "args": {
- "data_split_filename": "data_split.json"
- }
- }
- ]
-}
diff --git a/examples/advanced/xgboost/tree-based/jobs/cyclic_base/app/config/config_fed_server.json b/examples/advanced/xgboost/tree-based/jobs/cyclic_base/app/config/config_fed_server.json
deleted file mode 100755
index 93a8e3cf4b..0000000000
--- a/examples/advanced/xgboost/tree-based/jobs/cyclic_base/app/config/config_fed_server.json
+++ /dev/null
@@ -1,38 +0,0 @@
-{
- "format_version": 2,
- "num_rounds": 20,
- "task_data_filters": [],
- "task_result_filters": [],
-
- "components": [
- {
- "id": "persistor",
- "path": "nvflare.app_opt.xgboost.tree_based.model_persistor.XGBModelPersistor",
- "args": {
- "save_name": "xgboost_model.json",
- "load_as_dict": false
- }
- },
- {
- "id": "shareable_generator",
- "path": "nvflare.app_opt.xgboost.tree_based.shareable_generator.XGBModelShareableGenerator",
- "args": {}
- }
- ],
- "workflows": [
- {
- "id": "cyclic_ctl",
- "path": "nvflare.app_common.workflows.cyclic_ctl.CyclicController",
- "args": {
- "num_rounds": "{num_rounds}",
- "task_assignment_timeout": 60,
- "persistor_id": "persistor",
- "shareable_generator_id": "shareable_generator",
- "task_name": "train",
- "task_check_period": 0.01,
- "persist_every_n_rounds": 0,
- "snapshot_every_n_rounds": 0
- }
- }
- ]
-}
diff --git a/examples/advanced/xgboost/tree-based/jobs/cyclic_base/app/custom/higgs_data_loader.py b/examples/advanced/xgboost/tree-based/jobs/cyclic_base/app/custom/higgs_data_loader.py
deleted file mode 100644
index 124268cfce..0000000000
--- a/examples/advanced/xgboost/tree-based/jobs/cyclic_base/app/custom/higgs_data_loader.py
+++ /dev/null
@@ -1,77 +0,0 @@
-# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import json
-
-import pandas as pd
-import xgboost as xgb
-
-from nvflare.app_opt.xgboost.data_loader import XGBDataLoader
-
-
-def _read_higgs_with_pandas(data_path, start: int, end: int):
- data_size = end - start
- data = pd.read_csv(data_path, header=None, skiprows=start, nrows=data_size)
- data_num = data.shape[0]
-
- # split to feature and label
- x = data.iloc[:, 1:].copy()
- y = data.iloc[:, 0].copy()
-
- return x, y, data_num
-
-
-class HIGGSDataLoader(XGBDataLoader):
- def __init__(self, data_split_filename):
- """Reads HIGGS dataset and return XGB data matrix.
-
- Args:
- data_split_filename: file name to data splits
- """
- self.data_split_filename = data_split_filename
-
- def load_data(self):
- with open(self.data_split_filename, "r") as file:
- data_split = json.load(file)
-
- data_path = data_split["data_path"]
- data_index = data_split["data_index"]
-
- # check if site_id and "valid" in the mapping dict
- if self.client_id not in data_index.keys():
- raise ValueError(
- f"Data does not contain Client {self.client_id} split",
- )
-
- if "valid" not in data_index.keys():
- raise ValueError(
- "Data does not contain Validation split",
- )
-
- site_index = data_index[self.client_id]
- valid_index = data_index["valid"]
-
- # training
- x_train, y_train, total_train_data_num = _read_higgs_with_pandas(
- data_path=data_path, start=site_index["start"], end=site_index["end"]
- )
- dmat_train = xgb.DMatrix(x_train, label=y_train)
-
- # validation
- x_valid, y_valid, total_valid_data_num = _read_higgs_with_pandas(
- data_path=data_path, start=valid_index["start"], end=valid_index["end"]
- )
- dmat_valid = xgb.DMatrix(x_valid, label=y_valid, data_split_mode=self.data_split_mode)
-
- return dmat_train, dmat_valid
diff --git a/examples/advanced/xgboost/tree-based/jobs/cyclic_base/meta.json b/examples/advanced/xgboost/tree-based/jobs/cyclic_base/meta.json
deleted file mode 100644
index 58450dbfdd..0000000000
--- a/examples/advanced/xgboost/tree-based/jobs/cyclic_base/meta.json
+++ /dev/null
@@ -1,9 +0,0 @@
-{
- "name": "xgboost_tree_cyclic",
- "resource_spec": {},
- "deploy_map": {
- "app": [
- "@ALL"
- ]
- }
-}
diff --git a/examples/advanced/xgboost/tree-based/plot-requirements.txt b/examples/advanced/xgboost/tree-based/plot-requirements.txt
deleted file mode 100644
index 7262e63060..0000000000
--- a/examples/advanced/xgboost/tree-based/plot-requirements.txt
+++ /dev/null
@@ -1,2 +0,0 @@
-tensorflow
-seaborn
diff --git a/examples/advanced/xgboost/tree-based/prepare_data.sh b/examples/advanced/xgboost/tree-based/prepare_data.sh
deleted file mode 100755
index f7bdf9e68d..0000000000
--- a/examples/advanced/xgboost/tree-based/prepare_data.sh
+++ /dev/null
@@ -1,5 +0,0 @@
-#!/usr/bin/env bash
-
-SCRIPT_DIR="$( dirname -- "$0"; )";
-
-bash "${SCRIPT_DIR}"/../prepare_data.sh
diff --git a/examples/advanced/xgboost/tree-based/requirements.txt b/examples/advanced/xgboost/tree-based/requirements.txt
deleted file mode 100644
index d79a5bef89..0000000000
--- a/examples/advanced/xgboost/tree-based/requirements.txt
+++ /dev/null
@@ -1,9 +0,0 @@
-nvflare~=2.5.0rc
-pandas
-scikit-learn
-torch
-tensorboard
-matplotlib
-shap
-# require xgboost 2.2 version, for now need to install a nightly build
-https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/federated-secure/xgboost-2.2.0.dev0%2B4601688195708f7c31fcceeb0e0ac735e7311e61-py3-none-manylinux_2_28_x86_64.whl
diff --git a/examples/advanced/xgboost/tree-based/run_experiment_centralized.sh b/examples/advanced/xgboost/tree-based/run_experiment_centralized.sh
deleted file mode 100755
index 83cfa81162..0000000000
--- a/examples/advanced/xgboost/tree-based/run_experiment_centralized.sh
+++ /dev/null
@@ -1,13 +0,0 @@
-#!/usr/bin/env bash
-DATASET_PATH="$HOME/dataset/HIGGS.csv"
-
-if [ ! -f "${DATASET_PATH}" ]
-then
- echo "Please check if you saved HIGGS dataset in ${DATASET_PATH}"
-fi
-
-python3 ../utils/baseline_centralized.py --num_parallel_tree 1 --data_path "${DATASET_PATH}"
-python3 ../utils/baseline_centralized.py --num_parallel_tree 5 --subsample 0.8 --data_path "${DATASET_PATH}"
-python3 ../utils/baseline_centralized.py --num_parallel_tree 5 --subsample 0.2 --data_path "${DATASET_PATH}"
-python3 ../utils/baseline_centralized.py --num_parallel_tree 20 --subsample 0.05 --data_path "${DATASET_PATH}"
-python3 ../utils/baseline_centralized.py --num_parallel_tree 20 --subsample 0.8 --data_path "${DATASET_PATH}"
diff --git a/examples/advanced/xgboost/tree-based/run_experiment_simulator.sh b/examples/advanced/xgboost/tree-based/run_experiment_simulator.sh
deleted file mode 100755
index 05b2a050e7..0000000000
--- a/examples/advanced/xgboost/tree-based/run_experiment_simulator.sh
+++ /dev/null
@@ -1,22 +0,0 @@
-#!/usr/bin/env bash
-
-n=5
-for study in bagging_uniform_split_uniform_lr \
- bagging_exponential_split_uniform_lr \
- bagging_exponential_split_scaled_lr \
- cyclic_uniform_split_uniform_lr \
- cyclic_exponential_split_uniform_lr
-do
- nvflare simulator jobs/higgs_${n}_${study} -w ${PWD}/workspaces/xgboost_workspace_${n}_${study} -n ${n} -t ${n}
-done
-
-
-n=20
-for study in bagging_uniform_split_uniform_lr \
- bagging_square_split_uniform_lr \
- bagging_square_split_scaled_lr \
- cyclic_uniform_split_uniform_lr \
- cyclic_square_split_uniform_lr
-do
- nvflare simulator jobs/higgs_${n}_${study} -w ${PWD}/workspaces/xgboost_workspace_${n}_${study} -n ${n} -t ${n}
-done
diff --git a/examples/advanced/xgboost/tree-based/utils/plot_tensorboard_events.py b/examples/advanced/xgboost/tree-based/utils/plot_tensorboard_events.py
deleted file mode 100644
index bc6953f274..0000000000
--- a/examples/advanced/xgboost/tree-based/utils/plot_tensorboard_events.py
+++ /dev/null
@@ -1,136 +0,0 @@
-# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import glob
-import os
-
-import matplotlib.pyplot as plt
-import seaborn as sns
-import tensorflow as tf
-
-# simulator workspace
-client_results_root = "./workspaces/xgboost_workspace_"
-client_num_list = [5, 20]
-client_pre = "app_site-"
-centralized_path = "./workspaces/centralized_1_1/events.*"
-
-# bagging and cyclic need different handle
-experiments_bagging = {
- 5: {
- "5_bagging_uniform_split_uniform_lr": {"tag": "AUC"},
- "5_bagging_exponential_split_uniform_lr": {"tag": "AUC"},
- "5_bagging_exponential_split_scaled_lr": {"tag": "AUC"},
- },
- 20: {
- "20_bagging_uniform_split_uniform_lr": {"tag": "AUC"},
- "20_bagging_square_split_uniform_lr": {"tag": "AUC"},
- "20_bagging_square_split_scaled_lr": {"tag": "AUC"},
- },
-}
-experiments_cyclic = {
- 5: {
- "5_cyclic_uniform_split_uniform_lr": {"tag": "AUC"},
- "5_cyclic_exponential_split_uniform_lr": {"tag": "AUC"},
- },
- 20: {
- "20_cyclic_uniform_split_uniform_lr": {"tag": "AUC"},
- "20_cyclic_square_split_uniform_lr": {"tag": "AUC"},
- },
-}
-
-weight = 0.0
-
-
-def smooth(scalars, weight): # Weight between 0 and 1
- last = scalars[0] # First value in the plot (first timestep)
- smoothed = list()
- for point in scalars:
- smoothed_val = last * weight + (1 - weight) * point # Calculate smoothed value
- smoothed.append(smoothed_val) # Save it
- last = smoothed_val # Anchor the last smoothed value
- return smoothed
-
-
-def read_eventfile(filepath, tags=["AUC"]):
- data = {}
- for summary in tf.compat.v1.train.summary_iterator(filepath):
- for v in summary.summary.value:
- if v.tag in tags:
- if v.tag in data.keys():
- data[v.tag].append([summary.step, v.simple_value])
- else:
- data[v.tag] = [[summary.step, v.simple_value]]
- return data
-
-
-def add_eventdata(data, config, filepath, tag="AUC"):
- event_data = read_eventfile(filepath, tags=[tag])
- assert len(event_data[tag]) > 0, f"No data for key {tag}"
-
- metric = []
- for e in event_data[tag]:
- # print(e)
- data["Config"].append(config)
- data["Round"].append(e[0])
- metric.append(e[1])
-
- metric = smooth(metric, weight)
- for entry in metric:
- data["AUC"].append(entry)
-
- print(f"added {len(event_data[tag])} entries for {tag}")
-
-
-def main():
- plt.figure()
-
- for client_num in client_num_list:
- plt.figure
- plt.title(f"{client_num} client experiments")
- # add event files
- data = {"Config": [], "Round": [], "AUC": []}
- # add centralized result
- eventfile = glob.glob(centralized_path, recursive=True)
- assert len(eventfile) == 1, "No unique event file found!" + eventfile
- eventfile = eventfile[0]
- print("adding", eventfile)
- add_eventdata(data, "centralized", eventfile, tag="AUC")
- # pick first client for bagging experiments
- site = 1
- for config, exp in experiments_bagging[client_num].items():
- record_path = os.path.join(client_results_root + config, "simulate_job", client_pre + str(site), "events.*")
- eventfile = glob.glob(record_path, recursive=True)
- assert len(eventfile) == 1, "No unique event file found!"
- eventfile = eventfile[0]
- print("adding", eventfile)
- add_eventdata(data, config, eventfile, tag=exp["tag"])
-
- # Combine all clients' records for cyclic experiments
- for site in range(1, client_num + 1):
- for config, exp in experiments_cyclic[client_num].items():
- record_path = os.path.join(
- client_results_root + config, "simulate_job", client_pre + str(site), "events.*"
- )
- eventfile = glob.glob(record_path, recursive=True)
- assert len(eventfile) == 1, f"No unique event file found under {record_path}!"
- eventfile = eventfile[0]
- print("adding", eventfile)
- add_eventdata(data, config, eventfile, tag=exp["tag"])
-
- sns.lineplot(x="Round", y="AUC", hue="Config", data=data)
- plt.show()
-
-
-if __name__ == "__main__":
- main()
diff --git a/examples/advanced/xgboost/utils/prepare_job_config.py b/examples/advanced/xgboost/utils/prepare_job_config.py
deleted file mode 100644
index c7339391ab..0000000000
--- a/examples/advanced/xgboost/utils/prepare_job_config.py
+++ /dev/null
@@ -1,239 +0,0 @@
-# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import argparse
-import json
-import os
-import pathlib
-import shutil
-
-from nvflare.apis.fl_constant import JobConstants
-
-SCRIPT_PATH = pathlib.Path(os.path.realpath(__file__))
-XGB_EXAMPLE_ROOT = SCRIPT_PATH.parent.parent.absolute()
-JOB_CONFIGS_ROOT = "jobs"
-ALGO_DIR_MAP = {
- "bagging": "tree-based",
- "cyclic": "tree-based",
- "histogram": "histogram-based",
- "histogram_v2": "histogram-based",
-}
-BASE_JOB_MAP = {"bagging": "bagging_base", "cyclic": "cyclic_base", "histogram": "base", "histogram_v2": "base_v2"}
-
-
-def job_config_args_parser():
- parser = argparse.ArgumentParser(description="generate train configs for HIGGS dataset")
- parser.add_argument(
- "--data_root",
- type=str,
- default="/tmp/nvflare/xgboost_higgs_dataset",
- help="Path to dataset config files for each site",
- )
- parser.add_argument("--site_num", type=int, default=5, help="Total number of sites")
- parser.add_argument("--site_name_prefix", type=str, default="site-", help="Site name prefix")
- parser.add_argument("--round_num", type=int, default=100, help="Total number of training rounds")
- parser.add_argument(
- "--training_algo", type=str, default="bagging", choices=list(ALGO_DIR_MAP.keys()), help="Training algorithm"
- )
- parser.add_argument("--split_method", type=str, default="uniform", help="How to split the dataset")
- parser.add_argument("--lr_mode", type=str, default="uniform", help="Whether to use uniform or scaled shrinkage")
- parser.add_argument("--nthread", type=int, default=16, help="nthread for xgboost")
- parser.add_argument(
- "--tree_method", type=str, default="hist", help="tree_method for xgboost - use hist for best perf"
- )
- parser.add_argument("--data_split_mode", type=int, default=0, help="dataset split mode, 0 or 1")
- parser.add_argument("--secure_training", type=bool, default=False, help="histogram_v2 secure training or not")
- return parser
-
-
-def _read_json(filename):
- if not os.path.isfile(filename):
- raise ValueError(f"{filename} does not exist!")
- with open(filename, "r") as f:
- return json.load(f)
-
-
-def _write_json(data, filename):
- with open(filename, "w") as f:
- json.dump(data, f, indent=4)
-
-
-def _get_job_name(args) -> str:
- return (
- "higgs_"
- + str(args.site_num)
- + "_"
- + args.training_algo
- + "_"
- + args.split_method
- + "_split"
- + "_"
- + args.lr_mode
- + "_lr"
- )
-
-
-def _get_data_split_name(args, site_name: str) -> str:
- return os.path.join(args.data_root, f"{args.site_num}_{args.split_method}", f"data_{site_name}.json")
-
-
-def _get_src_job_dir(training_algo):
- return XGB_EXAMPLE_ROOT / ALGO_DIR_MAP[training_algo] / JOB_CONFIGS_ROOT / BASE_JOB_MAP[training_algo]
-
-
-def _gen_deploy_map(num_sites: int, site_name_prefix: str) -> dict:
- deploy_map = {"app_server": ["server"]}
- for i in range(1, num_sites + 1):
- deploy_map[f"app_{site_name_prefix}{i}"] = [f"{site_name_prefix}{i}"]
- return deploy_map
-
-
-def _update_meta(meta: dict, args):
- name = _get_job_name(args)
- meta["name"] = name
- meta["deploy_map"] = _gen_deploy_map(args.site_num, args.site_name_prefix)
- meta["min_clients"] = args.site_num
-
-
-def _get_lr_scale_from_split_json(data_split: dict):
- split = {}
- total_data_num = 0
- for k, v in data_split["data_index"].items():
- if k == "valid":
- continue
- data_num = int(v["end"] - v["start"])
- total_data_num += data_num
- split[k] = data_num
-
- lr_scales = {}
- for k in split:
- lr_scales[k] = split[k] / total_data_num
-
- return lr_scales
-
-
-def _update_client_config(config: dict, args, lr_scale, site_name: str):
- data_split_name = _get_data_split_name(args, site_name)
- if args.training_algo == "bagging" or args.training_algo == "cyclic":
- # update client config
- config["executors"][0]["executor"]["args"]["lr_scale"] = lr_scale
- config["executors"][0]["executor"]["args"]["lr_mode"] = args.lr_mode
- config["executors"][0]["executor"]["args"]["nthread"] = args.nthread
- config["executors"][0]["executor"]["args"]["tree_method"] = args.tree_method
- config["executors"][0]["executor"]["args"]["training_mode"] = args.training_algo
- num_client_bagging = 1
- if args.training_algo == "bagging":
- num_client_bagging = args.site_num
- config["executors"][0]["executor"]["args"]["num_client_bagging"] = num_client_bagging
- elif args.training_algo == "histogram":
- config["num_rounds"] = args.round_num
- config["executors"][0]["executor"]["args"]["xgb_params"]["nthread"] = args.nthread
- config["executors"][0]["executor"]["args"]["xgb_params"]["tree_method"] = args.tree_method
- config["components"][0]["args"]["data_split_filename"] = data_split_name
-
-
-def _update_server_config(config: dict, args):
- if args.training_algo == "bagging":
- config["num_rounds"] = args.round_num + 1
- config["workflows"][0]["args"]["min_clients"] = args.site_num
- elif args.training_algo == "cyclic":
- config["num_rounds"] = int(args.round_num / args.site_num)
- elif args.training_algo == "histogram_v2":
- config["num_rounds"] = args.round_num
- config["workflows"][0]["args"]["xgb_params"]["nthread"] = args.nthread
- config["workflows"][0]["args"]["xgb_params"]["tree_method"] = args.tree_method
- config["workflows"][0]["args"]["data_split_mode"] = args.data_split_mode
- config["workflows"][0]["args"]["secure_training"] = args.secure_training
-
-
-def _copy_custom_files(src_job_path, src_app_name, dst_job_path, dst_app_name):
- dst_path = dst_job_path / dst_app_name / "custom"
- os.makedirs(dst_path, exist_ok=True)
- src_path = src_job_path / src_app_name / "custom"
- if os.path.isdir(src_path):
- shutil.copytree(src_path, dst_path, dirs_exist_ok=True)
-
-
-def create_server_app(src_job_path, src_app_name, dst_job_path, site_name, args):
- dst_app_name = f"app_{site_name}"
- server_config = _read_json(src_job_path / src_app_name / "config" / JobConstants.SERVER_JOB_CONFIG)
- dst_config_path = dst_job_path / dst_app_name / "config"
-
- # make target config folders
- if not os.path.exists(dst_config_path):
- os.makedirs(dst_config_path)
-
- _update_server_config(server_config, args)
- server_config_filename = dst_config_path / JobConstants.SERVER_JOB_CONFIG
- _write_json(server_config, server_config_filename)
-
-
-def create_client_app(src_job_path, src_app_name, dst_job_path, site_name, args):
- dst_app_name = f"app_{site_name}"
- client_config = _read_json(src_job_path / src_app_name / "config" / JobConstants.CLIENT_JOB_CONFIG)
- dst_config_path = dst_job_path / dst_app_name / "config"
-
- # make target config folders
- if not os.path.exists(dst_config_path):
- os.makedirs(dst_config_path)
-
- # get lr scale
- data_split_name = _get_data_split_name(args, site_name)
- data_split = _read_json(data_split_name)
- lr_scales = _get_lr_scale_from_split_json(data_split)
-
- # adjust file contents according to each job's specs
- _update_client_config(client_config, args, lr_scales[site_name], site_name)
- client_config_filename = dst_config_path / JobConstants.CLIENT_JOB_CONFIG
- _write_json(client_config, client_config_filename)
-
- # copy custom file
- _copy_custom_files(src_job_path, src_app_name, dst_job_path, dst_app_name)
-
-
-def main():
- parser = job_config_args_parser()
- args = parser.parse_args()
- job_name = _get_job_name(args)
- src_job_path = _get_src_job_dir(args.training_algo)
-
- # create a new job
- dst_job_path = XGB_EXAMPLE_ROOT / ALGO_DIR_MAP[args.training_algo] / JOB_CONFIGS_ROOT / job_name
- if not os.path.exists(dst_job_path):
- os.makedirs(dst_job_path)
-
- # update meta
- meta_config_dst = dst_job_path / JobConstants.META_FILE
- meta_config = _read_json(src_job_path / JobConstants.META_FILE)
- _update_meta(meta_config, args)
- _write_json(meta_config, meta_config_dst)
-
- # create server side app
- create_server_app(
- src_job_path=src_job_path, src_app_name="app", dst_job_path=dst_job_path, site_name="server", args=args
- )
-
- # create client side app
- for i in range(1, args.site_num + 1):
- create_client_app(
- src_job_path=src_job_path,
- src_app_name="app",
- dst_job_path=dst_job_path,
- site_name=f"{args.site_name_prefix}{i}",
- args=args,
- )
-
-
-if __name__ == "__main__":
- main()
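The job-config script above scales each site's learning rate by its share of the training data: `_get_lr_scale_from_split_json` sums the per-site row counts from the data-split JSON (skipping the `valid` range) and divides each site's count by the total. A minimal sketch of that computation with a hypothetical split file (site names and row ranges are illustrative only):

```python
# Hypothetical data-split JSON content; "valid" is excluded from the total.
data_split = {
    "data_index": {
        "valid": {"start": 0, "end": 1_000_000},
        "site-1": {"start": 1_000_000, "end": 3_000_000},   # 2M rows
        "site-2": {"start": 3_000_000, "end": 5_000_000},   # 2M rows
        "site-3": {"start": 5_000_000, "end": 11_000_000},  # 6M rows
    }
}

# Per-site learning-rate scale = site's training rows / total training rows.
total = sum(
    v["end"] - v["start"] for k, v in data_split["data_index"].items() if k != "valid"
)
lr_scales = {
    k: (v["end"] - v["start"]) / total
    for k, v in data_split["data_index"].items()
    if k != "valid"
}
print(lr_scales)  # {'site-1': 0.2, 'site-2': 0.2, 'site-3': 0.6}
```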
diff --git a/nvflare/app_opt/lightning/api.py b/nvflare/app_opt/lightning/api.py
index 4e674e5915..45629a6b42 100644
--- a/nvflare/app_opt/lightning/api.py
+++ b/nvflare/app_opt/lightning/api.py
@@ -12,6 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
+import logging
from typing import Dict
import pytorch_lightning as pl
@@ -29,7 +30,9 @@
FL_META_KEY = "__fl_meta__"
-def patch(trainer: pl.Trainer, restore_state: bool = True, load_state_dict_strict: bool = True):
+def patch(
+ trainer: pl.Trainer, restore_state: bool = True, load_state_dict_strict: bool = True, update_fit_loop: bool = True
+):
"""Patches the PyTorch Lightning Trainer for usage with NVFlare.
Args:
@@ -39,6 +42,8 @@ def patch(trainer: pl.Trainer, restore_state: bool = True, load_state_dict_stric
load_state_dict_strict: exposes `strict` argument of `torch.nn.Module.load_state_dict()`
used to load the received model. Defaults to `True`.
See https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.load_state_dict for details.
+ update_fit_loop: whether to increase `trainer.fit_loop.max_epochs` and `trainer.fit_loop.epoch_loop.max_steps` each FL round.
+ Defaults to `True`, which is suitable for most PyTorch Lightning applications.
Example:
@@ -75,7 +80,9 @@ def __init__(self):
callbacks = []
if not any(isinstance(cb, FLCallback) for cb in callbacks):
- fl_callback = FLCallback(rank=trainer.global_rank, load_state_dict_strict=load_state_dict_strict)
+ fl_callback = FLCallback(
+ rank=trainer.global_rank, load_state_dict_strict=load_state_dict_strict, update_fit_loop=update_fit_loop
+ )
callbacks.append(fl_callback)
if restore_state and not any(isinstance(cb, RestoreState) for cb in callbacks):
@@ -85,7 +92,7 @@ def __init__(self):
class FLCallback(Callback):
- def __init__(self, rank: int = 0, load_state_dict_strict: bool = True):
+ def __init__(self, rank: int = 0, load_state_dict_strict: bool = True, update_fit_loop: bool = True):
"""FL callback for lightning API.
Args:
@@ -93,6 +100,8 @@ def __init__(self, rank: int = 0, load_state_dict_strict: bool = True):
load_state_dict_strict: exposes `strict` argument of `torch.nn.Module.load_state_dict()`
used to load the received model. Defaults to `True`.
See https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.load_state_dict for details.
+ update_fit_loop: whether to increase `trainer.fit_loop.max_epochs` and `trainer.fit_loop.epoch_loop.max_steps` each FL round.
+ Defaults to `True`, which is suitable for most PyTorch Lightning applications.
"""
super(FLCallback, self).__init__()
init(rank=str(rank))
@@ -108,6 +117,9 @@ def __init__(self, rank: int = 0, load_state_dict_strict: bool = True):
self._is_evaluation = False
self._is_submit_model = False
self._load_state_dict_strict = load_state_dict_strict
+ self._update_fit_loop = update_fit_loop
+
+ self.logger = logging.getLogger(self.__class__.__name__)
def reset_state(self, trainer):
"""Resets the state.
@@ -130,10 +142,12 @@ def reset_state(self, trainer):
# for next round
trainer.num_sanity_val_steps = 0 # Turn off sanity validation steps in following rounds of FL
- if self.total_local_epochs and self.max_epochs_per_round is not None:
- trainer.fit_loop.max_epochs = self.max_epochs_per_round + self.total_local_epochs
- if self.total_local_steps and self.max_steps_per_round is not None:
- trainer.fit_loop.epoch_loop.max_steps = self.max_steps_per_round + self.total_local_steps
+
+ if self._update_fit_loop:
+ if self.total_local_epochs and self.max_epochs_per_round is not None:
+ trainer.fit_loop.max_epochs = self.max_epochs_per_round + self.total_local_epochs
+ if self.total_local_steps and self.max_steps_per_round is not None:
+ trainer.fit_loop.epoch_loop.max_steps = self.max_steps_per_round + self.total_local_steps
# resets attributes
self.metrics = None
@@ -184,7 +198,15 @@ def _receive_and_update_model(self, trainer, pl_module):
model = self._receive_model(trainer)
if model:
if model.params:
- pl_module.load_state_dict(model.params, strict=self._load_state_dict_strict)
+ missing_keys, unexpected_keys = pl_module.load_state_dict(
+ model.params, strict=self._load_state_dict_strict
+ )
+ if len(missing_keys) > 0:
+ self.logger.warning(f"There were missing keys when loading the global state_dict: {missing_keys}")
+ if len(unexpected_keys) > 0:
+ self.logger.warning(
+ f"There were unexpected keys when loading the global state_dict: {unexpected_keys}"
+ )
if model.current_round is not None:
self.current_round = model.current_round
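The new `update_fit_loop` flag lets applications that manage `max_epochs`/`max_steps` themselves opt out of extending the fit loop each FL round, and non-strict loading now logs missing/unexpected keys instead of dropping them silently. A hedged usage sketch; the `nvflare.client.lightning` import alias mirrors the pattern used in the Lightning examples and is an assumption here:

```python
import pytorch_lightning as pl

# Assumed client-side import alias for the patched Lightning API.
import nvflare.client.lightning as flare

trainer = pl.Trainer(max_epochs=1, num_sanity_val_steps=0)

# Load the received global model non-strictly (mismatched keys are logged as
# warnings) and keep max_epochs / max_steps fixed across FL rounds.
flare.patch(trainer, load_state_dict_strict=False, update_fit_loop=False)
```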
diff --git a/nvflare/app_opt/psi/dh_psi/dh_psi_task_handler.py b/nvflare/app_opt/psi/dh_psi/dh_psi_task_handler.py
index cc433954ea..5d84224534 100644
--- a/nvflare/app_opt/psi/dh_psi/dh_psi_task_handler.py
+++ b/nvflare/app_opt/psi/dh_psi/dh_psi_task_handler.py
@@ -49,6 +49,8 @@ def __init__(self, local_psi_id: str):
self.local_psi_handler: Optional[PSI] = None
self.client_name = None
self.items = None
+ # store local_psi_id as an instance attribute; it is needed by the JobAPI
+ self.local_psi_id = local_psi_id
def initialize(self, fl_ctx: FLContext):
super().initialize(fl_ctx)