Commit

Merge branch 'NVIDIA:main' into gnn_enc

ZiyueXu77 authored Oct 14, 2024
2 parents a35ae31 + a5a6353 commit 9d7e125
Showing 28 changed files with 826 additions and 780 deletions.
5 changes: 3 additions & 2 deletions docs/programming_guide/controllers/model_controller.rst
@@ -228,8 +228,9 @@ For example we can use PyTorch's save and load functions for the model parameter
return model
Note: for non-primitive data types such as ``torch.nn.Module`` (used for the initial PyTorch model),
we must configure a corresponding FOBS decomposer for serialization and deserialization.
Read more at :ref:`serialization`.

.. code-block:: python
17 changes: 17 additions & 0 deletions docs/programming_guide/execution_api_type/client_api.rst
@@ -261,7 +261,24 @@ Client API configuration. You can further nail down the selection by choice of
in-process or not, types of models (GNN, NeMo LLM), workflow patterns (swarm learning or standard FedAvg with scatter and gather (SAG)), etc.


Custom Data Class Serialization/Deserialization
===============================================

To pass data in the form of a custom class, you can leverage the serialization tool inside NVFlare.

For example:

.. code-block:: python

    class CustomClass:
        def __init__(self, x, y):
            self.x = x
            self.y = y

If you are using classes derived from ``Enum`` or a dataclass, they will be handled by the default decomposers.
For other custom classes, you will need to write a dedicated custom decomposer and ensure it is registered
with ``fobs.register`` on both the server side and the client side, as well as in ``train.py``.

Please note that for the custom data class to work, it must be placed in a file separate from ``train.py``.
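To illustrate what a decomposer must provide, the following self-contained sketch mimics the decompose/recompose round trip. It is a hypothetical stand-in, not the real NVFlare API: an actual decomposer subclasses ``nvflare.fuel.utils.fobs.Decomposer`` and is registered with ``fobs.register``; the class and method names below only mirror that pattern.

```python
from typing import Any, Dict, Type


class CustomClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Hypothetical stand-in for a FOBS-style decomposer: it breaks a custom
# object down into primitive data and rebuilds an equivalent object.
class CustomClassDecomposer:
    def supported_type(self) -> Type:
        return CustomClass

    def decompose(self, target: CustomClass) -> Dict[str, Any]:
        # Reduce the object to primitives that any serializer can handle.
        return {"x": target.x, "y": target.y}

    def recompose(self, data: Dict[str, Any]) -> CustomClass:
        # Rebuild an equivalent object from the primitive form.
        return CustomClass(data["x"], data["y"])


# Round trip: decompose on the sending side, recompose on the receiving side.
decomposer = CustomClassDecomposer()
wire_data = decomposer.decompose(CustomClass(1, 2))
restored = decomposer.recompose(wire_data)
print(restored.x, restored.y)  # → 1 2
```

In the real system, both sides register the decomposer so that objects of the supported type can cross the wire transparently.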

For more details on serialization, please refer to :ref:`serialization`.
54 changes: 49 additions & 5 deletions docs/quickstart.rst
@@ -101,6 +101,17 @@ Note on branches:

* The 2.1, 2.2, 2.3, 2.4, 2.5, etc. branches correspond to each major release; tags based on these add a third digit for minor patches

Install NVFlare from source
----------------------------

Navigate to the NVFlare repository and install it with pip in development mode (useful, for example, to access the latest features or to test custom builds):

.. code-block:: shell

    $ git clone https://github.com/NVIDIA/NVFlare.git
    $ cd NVFlare
    $ pip install -e .

.. _containerized_deployment:

@@ -197,11 +208,30 @@ Production mode is secure with TLS certificates - depending the choice the deplo

In non-HA, secure, local mode (all clients and the server running on the same host), production mode is very similar to POC mode except that it is secure.

Which mode should I choose for running NVFLARE? (Note: the same jobs can be run in any of the modes, and the same project.yml deployment options can be used in both POC mode and production mode.)

.. list-table:: NVIDIA FLARE Modes
   :header-rows: 1

   * - **Mode**
     - **Documentation**
     - **Description**
   * - Simulator
     - :ref:`fl_simulator`
     - | The FL Simulator is a lightweight simulation where the job run is automated on a
       | single system. Useful for quickly running a job or experimenting with research
       | or FL algorithms.
   * - POC
     - :ref:`poc_command`
     - | POC mode establishes and connects distinct server and client "systems" which can
       | then be orchestrated from a single machine using the FLARE Console. Users can
       | also experiment with various deployment options (project.yml), which can be used
       | in production mode.
   * - Production
     - :ref:`provisioned_setup`
     - | Real-world production mode involves a distributed deployment with generated startup
       | kits from the provisioning process. Provides a provisioning tool, dashboard, and
       | various deployment options.

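The deployment options mentioned for POC and production mode are driven by a ``project.yml`` file consumed by the provisioning tool. Below is a minimal illustrative sketch; the project name, participant names, orgs, and ports are placeholder assumptions, and the real file supports many more participants, builders, and settings (see the provisioning documentation for the full schema):

```yaml
api_version: 3
name: example_project
description: Minimal illustrative project file (placeholder values)

participants:
  # One server, one client site, and one project admin.
  - name: server1
    type: server
    org: nvidia
    fed_learn_port: 8002
    admin_port: 8003
  - name: site-1
    type: client
    org: nvidia
  - name: admin@nvidia.com
    type: admin
    org: nvidia
    role: project_admin
```

Running the provisioning tool against such a file generates a startup kit per participant, which is what distinguishes production mode from the simulator and POC mode.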
.. _starting_fl_simulator:

@@ -376,4 +406,18 @@ To start the server and client systems without an admin console:
.. code-block::

    nvflare poc start -ex admin@nvidia.com

We can use the :ref:`job_cli` to easily submit a job to the POC system. (Note: We can run the same jobs we ran with the simulator in POC mode. If using the :ref:`fed_job_api`, simply export the job configuration with ``job.export_job()``.)

.. code-block::

    nvflare job submit -j NVFlare/examples/hello-world/hello-numpy-sag/jobs/hello-numpy-sag

Once finished, stop the POC system:

.. code-block::

    nvflare poc stop

Then clean up the POC workspace:

.. code-block::

    nvflare poc clean

For more details, see :ref:`poc_command`.
2 changes: 1 addition & 1 deletion docs/real_world_fl/overview.rst
@@ -68,7 +68,7 @@ For advanced users, you can customize your provision with additional behavior th
- **Zip**: To create password protected zip archives for the startup kits, see :ref:`distribution_builder`
- **Docker-compose**: Provision to launch NVIDIA FLARE system via docker containers. You can customize the provisioning process and ask the provisioner to generate a docker-compose file. This can be found in :ref:`docker_compose`.
- **Docker**: Provision to launch NVIDIA FLARE system via docker containers. If you just want to use docker files, see :ref:`containerized_deployment`.
- **Helm**: To change the provisioning tool to generate an NVIDIA FLARE Helm chart for Kubernetes deployment, see :ref:`helm_chart`.
- **CUSTOM**: you can build custom builders specific to your needs like in :ref:`distribution_builder`.

Package distribution
63 changes: 62 additions & 1 deletion docs/user_guide/nvflare_cli/poc_command.rst
@@ -7,6 +7,8 @@ Proof Of Concept (POC) Command

The POC command allows users to try out the features of NVFlare in a proof of concept deployment on a single machine.

Different processes represent the server, clients, and the admin console, making it a useful tool in preparation for a distributed deployment.

Syntax and Usage
=================

@@ -292,7 +294,7 @@ will start ALL clients (site-1, site-2) and server as well as FLARE Console (aka
.. note::

    If you prefer to have the FLARE Console on a different terminal, you can start everything else with: ``nvflare poc start -ex admin@nvidia.com``.

Start the server only
----------------------
@@ -356,6 +358,59 @@ If there is no GPU, then there will be no assignments. If there are GPUs, they w
nvidia-smi --list-gpus

Operating the System and Submitting a Job
==========================================

After preparing the POC workspace and starting the server, clients, and console (optional), we have several options for operating the whole system.

First, link the desired job directory to the admin's transfer directory:

.. code-block:: none

    nvflare poc prepare-jobs-dir -j NVFlare/examples

FLARE Console
--------------
After starting the FLARE console with:

.. code-block:: none

    nvflare poc start -p admin@nvidia.com

Log in and submit the job:

.. code-block:: none

    submit_job hello-world/hello-numpy-sag/jobs/hello-numpy-sag

Refer to :ref:`operating_nvflare` for more details.

FLARE API
---------
To programmatically operate the system and submit a job, use the :ref:`flare_api`:

.. code-block:: python

    import os

    from nvflare.fuel.flare_api.flare_api import new_secure_session

    poc_workspace = "/tmp/nvflare/poc"
    poc_prepared = os.path.join(poc_workspace, "example_project/prod_00")
    admin_dir = os.path.join(poc_prepared, "admin@nvidia.com")
    sess = new_secure_session("admin@nvidia.com", startup_kit_location=admin_dir)
    job_id = sess.submit_job("hello-world/hello-numpy-sag/jobs/hello-numpy-sag")
    print(f"Job is running with ID {job_id}")

Job CLI
-------
The :ref:`job_cli` also provides a convenient command to submit a job:

.. code-block:: none

    nvflare job submit -j NVFlare/examples/hello-world/hello-numpy-sag/jobs/hello-numpy-sag

Stop Package(s)
===============

@@ -381,3 +436,9 @@ There is a command to clean up the POC workspace added in version 2.2 that will
.. code-block::

    nvflare poc clean

Learn More
===========

To learn about the different options of the POC command in more detail, see the
:github_nvflare_link:`Setup NVFLARE in POC Mode Tutorial <examples/tutorials/setup_poc.ipynb>`.