Commit

Merge branch 'NVIDIA:main' into gnn_enc

ZiyueXu77 authored Oct 14, 2024
2 parents a35ae31 + a5a6353 commit 9d7e125
Showing 28 changed files with 826 additions and 780 deletions.
5 changes: 3 additions & 2 deletions docs/programming_guide/controllers/model_controller.rst
@@ -228,8 +228,9 @@ For example we can use PyTorch's save and load functions for the model parameter
return model
Note: for non-primitive data types such as ``torch.nn.Module`` (used for the initial PyTorch model),
we must configure a corresponding FOBS decomposer for serialization and deserialization.
Read more at :ref:`serialization`.

.. code-block:: python
17 changes: 17 additions & 0 deletions docs/programming_guide/execution_api_type/client_api.rst
@@ -261,7 +261,24 @@ Client API configuration. You can further nail down the selection by choice of
in-process or not, types of models (GNN, NeMo LLM), workflow patterns (swarm learning or standard FedAvg with scatter and gather (SAG)), etc.


Custom Data Class Serialization/Deserialization
===============================================

To pass data in the form of a custom class, you can leverage the serialization tool inside NVFlare.

For example:

.. code-block:: python

    class CustomClass:
        def __init__(self, x, y):
            self.x = x
            self.y = y

If you are using classes derived from ``Enum`` or a dataclass, they will be handled by the default decomposers.
For other custom classes, you will need to write a dedicated custom decomposer and ensure it is registered
with ``fobs.register`` on both the server side and the client side, as well as in ``train.py``.

Please note that for the custom data class to work, it must be placed in a file separate from ``train.py``.
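To illustrate what a decomposer must provide, the following self-contained sketch mimics the decompose/recompose round trip. It is a hypothetical stand-in, not the real NVFlare API: an actual decomposer subclasses ``nvflare.fuel.utils.fobs.Decomposer`` and is registered with ``fobs.register``; the class and method names below only mirror that pattern.

```python
from typing import Any, Dict, Type


class CustomClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Hypothetical stand-in for a FOBS-style decomposer: it breaks a custom
# object down into primitive data and rebuilds an equivalent object.
class CustomClassDecomposer:
    def supported_type(self) -> Type:
        return CustomClass

    def decompose(self, target: CustomClass) -> Dict[str, Any]:
        # Reduce the object to primitives that any serializer can handle.
        return {"x": target.x, "y": target.y}

    def recompose(self, data: Dict[str, Any]) -> CustomClass:
        # Rebuild an equivalent object from the primitive form.
        return CustomClass(data["x"], data["y"])


# Round trip: decompose on the sending side, recompose on the receiving side.
decomposer = CustomClassDecomposer()
wire_data = decomposer.decompose(CustomClass(1, 2))
restored = decomposer.recompose(wire_data)
print(restored.x, restored.y)  # → 1 2
```

In the real system, both sides register the decomposer so that objects of the supported type can cross the wire transparently.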

For more details on serialization, please refer to :ref:`serialization`.
54 changes: 49 additions & 5 deletions docs/quickstart.rst
@@ -101,6 +101,17 @@ Note on branches:

* The 2.1, 2.2, 2.3, 2.4, 2.5, etc. branches correspond to each major release; tags based on these add a third digit for minor patches

Install NVFlare from source
----------------------------

Navigate to the NVFlare repository and install it with pip in development mode (useful, for example, to access the latest features or to test custom builds):

.. code-block:: shell

    $ git clone https://github.com/NVIDIA/NVFlare.git
    $ cd NVFlare
    $ pip install -e .

.. _containerized_deployment:

@@ -197,11 +208,30 @@ Production mode is secure with TLS certificates - depending the choice the deplo

In non-HA, secure, local mode (all clients and the server running on the same host), production mode is very similar to POC mode except that it is secure.

Which mode should I choose for running NVFLARE? (Note: the same jobs can be run in any of the modes, and the same project.yml deployment options can be used in both POC mode and production mode.)

.. list-table:: NVIDIA FLARE Modes
   :header-rows: 1

   * - **Mode**
     - **Documentation**
     - **Description**
   * - Simulator
     - :ref:`fl_simulator`
     - | The FL Simulator is a lightweight simulation where the job run is automated on a
       | single system. Useful for quickly running a job or experimenting with research
       | or FL algorithms.
   * - POC
     - :ref:`poc_command`
     - | POC mode establishes and connects distinct server and client "systems" which can
       | then be orchestrated from a single machine using the FLARE Console. Users can
       | also experiment with various deployment options (project.yml), which can be used
       | in production mode.
   * - Production
     - :ref:`provisioned_setup`
     - | Real-world production mode involves a distributed deployment with generated startup
       | kits from the provisioning process. Provides a provisioning tool, dashboard, and
       | various deployment options.

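The deployment options mentioned for POC and production mode are driven by a ``project.yml`` file consumed by the provisioning tool. Below is a minimal illustrative sketch; the project name, participant names, orgs, and ports are placeholder assumptions, and the real file supports many more participants, builders, and settings (see the provisioning documentation for the full schema):

```yaml
api_version: 3
name: example_project
description: Minimal illustrative project file (placeholder values)

participants:
  # One server, one client site, and one project admin.
  - name: server1
    type: server
    org: nvidia
    fed_learn_port: 8002
    admin_port: 8003
  - name: site-1
    type: client
    org: nvidia
  - name: admin@nvidia.com
    type: admin
    org: nvidia
    role: project_admin
```

Running the provisioning tool against such a file generates a startup kit per participant, which is what distinguishes production mode from the simulator and POC mode.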
.. _starting_fl_simulator:

@@ -376,4 +406,18 @@ To start the server and client systems without an admin console:
.. code-block::

    nvflare poc start -ex admin@nvidia.com

We can use the :ref:`job_cli` to easily submit a job to the POC system. (Note: We can run the same jobs we ran with the simulator in POC mode. If using the :ref:`fed_job_api`, simply export the job configuration with ``job.export_job()``.)

.. code-block::

    nvflare job submit -j NVFlare/examples/hello-world/hello-numpy-sag/jobs/hello-numpy-sag

Once finished, stop the POC system:

.. code-block::

    nvflare poc stop

Then clean up the POC workspace:

.. code-block::

    nvflare poc clean

For more details, see :ref:`poc_command`.
2 changes: 1 addition & 1 deletion docs/real_world_fl/overview.rst
@@ -68,7 +68,7 @@ For advanced users, you can customize your provision with additional behavior th
- **Zip**: To create password protected zip archives for the startup kits, see :ref:`distribution_builder`
- **Docker-compose**: Provision to launch NVIDIA FLARE system via docker containers. You can customize the provisioning process and ask the provisioner to generate a docker-compose file. This can be found in :ref:`docker_compose`.
- **Docker**: Provision to launch NVIDIA FLARE system via docker containers. If you just want to use docker files, see :ref:`containerized_deployment`.
- **Helm**: To change the provisioning tool to generate an NVIDIA FLARE Helm chart for Kubernetes deployment, see :ref:`helm_chart`.
- **CUSTOM**: you can build custom builders specific to your needs like in :ref:`distribution_builder`.

Package distribution
63 changes: 62 additions & 1 deletion docs/user_guide/nvflare_cli/poc_command.rst
@@ -7,6 +7,8 @@ Proof Of Concept (POC) Command

The POC command allows users to try out the features of NVFlare in a proof of concept deployment on a single machine.

Different processes represent the server, clients, and the admin console, making it a useful tool in preparation for a distributed deployment.

Syntax and Usage
=================

@@ -292,7 +294,7 @@ will start ALL clients (site-1, site-2) and server as well as FLARE Console (aka
.. note::

    If you prefer to have the FLARE Console on a different terminal, you can start everything else with: ``nvflare poc start -ex admin@nvidia.com``.

Start the server only
----------------------
@@ -356,6 +358,59 @@ If there is no GPU, then there will be no assignments. If there are GPUs, they w
nvidia-smi --list-gpus

Operating the System and Submitting a Job
==========================================

After preparing the POC workspace and starting the server, clients, and console (optional), we have several options for operating the whole system.

First, link the desired job directory to the admin's transfer directory:

.. code-block:: none

    nvflare poc prepare-jobs-dir -j NVFlare/examples

FLARE Console
--------------
After starting the FLARE console with:

.. code-block:: none

    nvflare poc start -p admin@nvidia.com

Log in and submit the job:

.. code-block:: none

    submit_job hello-world/hello-numpy-sag/jobs/hello-numpy-sag

Refer to :ref:`operating_nvflare` for more details.

FLARE API
---------
To programmatically operate the system and submit a job, use the :ref:`flare_api`:

.. code-block:: python

    import os

    from nvflare.fuel.flare_api.flare_api import new_secure_session

    poc_workspace = "/tmp/nvflare/poc"
    poc_prepared = os.path.join(poc_workspace, "example_project/prod_00")
    admin_dir = os.path.join(poc_prepared, "admin@nvidia.com")
    sess = new_secure_session("admin@nvidia.com", startup_kit_location=admin_dir)
    job_id = sess.submit_job("hello-world/hello-numpy-sag/jobs/hello-numpy-sag")
    print(f"Job is running with ID {job_id}")

Job CLI
-------
The :ref:`job_cli` also provides a convenient command to submit a job:

.. code-block:: none

    nvflare job submit -j NVFlare/examples/hello-world/hello-numpy-sag/jobs/hello-numpy-sag

Stop Package(s)
===============

@@ -381,3 +436,9 @@ There is a command to clean up the POC workspace added in version 2.2 that will
.. code-block::

    nvflare poc clean

Learn More
===========

To learn about the different options of the POC command in more detail, see the
:github_nvflare_link:`Setup NVFLARE in POC Mode Tutorial <examples/tutorials/setup_poc.ipynb>`.