Merge branch 'main' into bionemo_examples
holgerroth authored Feb 22, 2024
2 parents 427e679 + 3460970 commit 2dace43
Showing 313 changed files with 8,272 additions and 3,220 deletions.
3 changes: 0 additions & 3 deletions .github/workflows/markdown-links-check.yml
@@ -17,10 +17,7 @@ name: Check Markdown links

on:
push:
branches: [ "main", "dev" ]
pull_request:
# The branches below must be a subset of the branches above
branches: [ "main", "dev" ]

jobs:
markdown-link-check:
2 changes: 0 additions & 2 deletions .github/workflows/premerge.yml
@@ -17,8 +17,6 @@ name: pre-merge
on:
# quick tests for pull requests and the releasing branches
push:
branches:
- dev
pull_request:
workflow_dispatch:

5 changes: 4 additions & 1 deletion docs/_static/css/additions.css
@@ -1,3 +1,6 @@
.wy-menu-vertical li.toctree-l4.current li.toctree-l5>a{display:block;background:#b1b1b1;padding:.4045em 7.3em}
.wy-menu-vertical li.toctree-l5.current li.toctree-l6>a{display:block;background:#a9a9a9;padding:.4045em 8.8em}
.wy-menu-vertical li.toctree-l5{font-size: .9em;}
.wy-menu-vertical li.toctree-l5{font-size: .9em;}
.wy-menu > .caption > span.caption-text {
color: #76b900;
}
3 changes: 2 additions & 1 deletion docs/conf.py
@@ -44,7 +44,7 @@ def resolve_xref(self, env, fromdocname, builder, typ, target, node, contnode):
# -- Project information -----------------------------------------------------

project = "NVIDIA FLARE"
copyright = "2023, NVIDIA"
copyright = "2024, NVIDIA"
author = "NVIDIA"

# The full version, including alpha/beta/rc tags
@@ -114,6 +114,7 @@ def resolve_xref(self, env, fromdocname, builder, typ, target, node, contnode):
html_scaled_image_link = False
html_show_sourcelink = True
html_favicon = "favicon.ico"
html_logo = "resources/nvidia_logo.png"

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
47 changes: 36 additions & 11 deletions docs/example_applications_algorithms.rst
@@ -6,13 +6,7 @@ Example Applications
NVIDIA FLARE has several tutorials and examples to help you get started with federated learning and to explore certain features in the
:github_nvflare_link:`examples directory <examples>`.

1. Step-By-Step Example Series
==============================

* :github_nvflare_link:`Step-by-Step CIFAR-10 Examples (GitHub) <examples/hello-world/step-by-step/cifar10>` - Step-by-step example series with CIFAR-10 (image data) to showcase different FLARE features, workflows, and APIs.
* :github_nvflare_link:`Step-by-Step HIGGS Examples (GitHub) <examples/hello-world/step-by-step/higgs>` - Step-by-step example series with HIGGS (tabular data) to showcase different FLARE features, workflows, and APIs.

2. Hello World Examples
1. Hello World Examples
=======================
Can be run from the :github_nvflare_link:`hello_world notebook <examples/hello-world/hello_world.ipynb>`.

@@ -22,27 +16,58 @@ Can be run from the :github_nvflare_link:`hello_world notebook <examples/hello-world/hello_world.ipynb>`.

examples/hello_world_examples

2.1. Deep Learning to Federated Learning
1.1. Deep Learning to Federated Learning
----------------------------------------

* :github_nvflare_link:`Deep Learning to Federated Learning (GitHub) <examples/hello-world/ml-to-fl>` - Example for converting Deep Learning (DL) to Federated Learning (FL) using the Client API.

2.2. Workflows
1.2. Workflows
--------------

* :ref:`Hello Scatter and Gather <hello_scatter_and_gather>` - Example using the Scatter And Gather (SAG) workflow with a Numpy trainer
* :ref:`Hello Cross-Site Validation <hello_cross_val>` - Example using the Cross Site Model Eval workflow with a Numpy trainer
* :ref:`Hello Cross-Site Validation <hello_cross_val>` - Example using the Cross Site Model Eval workflow with a Numpy trainer; also demonstrates running cross-site validation using the previous training results.
* :github_nvflare_link:`Hello Cyclic Weight Transfer (GitHub) <examples/hello-world/hello-cyclic>` - Example using the CyclicController workflow to implement `Cyclic Weight Transfer <https://pubmed.ncbi.nlm.nih.gov/29617797/>`_ with TensorFlow as the deep learning training framework
* :github_nvflare_link:`Swarm Learning <examples/advanced/swarm_learning>` - Example using Swarm Learning and Client-Controlled Cross-site Evaluation workflows.
* :github_nvflare_link:`Client-Controlled Cyclic Weight Transfer <examples/hello-world/step-by-step/cifar10/cyclic_ccwf>` - Example using Client-Controlled Cyclic workflow using Client API.
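The cyclic workflows above pass a single model around the clients in turn. A toy sketch in plain Python (not the NVFlare API; the site names and one-parameter "model" are made up for illustration):

```python
# Toy Cyclic Weight Transfer: the model travels around the ring of
# clients; each client trains on its local data before passing it on.

def local_train(weight, data, lr=0.1):
    # Stand-in for real training: nudge the weight toward the local mean.
    local_mean = sum(data) / len(data)
    return weight + lr * (local_mean - weight)

clients = {"site-1": [1.0, 2.0, 3.0], "site-2": [5.0, 6.0], "site-3": [9.0]}

weight = 0.0
for round_num in range(10):              # rounds of cyclic transfer
    for name, data in clients.items():   # pass the model client -> client
        weight = local_train(weight, data)
```

In NVFlare, this round-robin handoff is what the server-controlled CyclicController and the client-controlled cyclic workflow orchestrate across real sites.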

2.3. Deep Learning
1.3. Deep Learning
------------------

* :ref:`Hello PyTorch <hello_pt>` - Example image classifier using FedAvg and PyTorch as the deep learning training framework
* :ref:`Hello TensorFlow <hello_tf2>` - Example image classifier using FedAvg and TensorFlow as the deep learning training framework



2. Step-By-Step Example Series
==============================

:github_nvflare_link:`Step-by-Step Examples (GitHub) <examples/hello-world/step-by-step/>` - Step-by-step example series with CIFAR-10 (image data) and HIGGS (tabular data) to showcase different FLARE features, workflows, and APIs.

2.1 CIFAR-10 Image Data Examples
--------------------------------

* :github_nvflare_link:`image_stats <examples/hello-world/step-by-step/cifar10/stats/image_stats.ipynb>` - federated statistics (histograms) of CIFAR-10.
* :github_nvflare_link:`sag <examples/hello-world/step-by-step/cifar10/sag/sag.ipynb>` - scatter and gather (SAG) workflow with PyTorch using the Client API.
* :github_nvflare_link:`sag_deploy_map <examples/hello-world/step-by-step/cifar10/sag_deploy_map/sag_deploy_map.ipynb>` - scatter and gather workflow with deploy_map configuration for deployment of apps to different sites using the Client API.
* :github_nvflare_link:`sag_model_learner <examples/hello-world/step-by-step/cifar10/sag_model_learner/sag_model_learner.ipynb>` - scatter and gather workflow illustrating how to write client code using the ModelLearner.
* :github_nvflare_link:`sag_executor <examples/hello-world/step-by-step/cifar10/sag_executor/sag_executor.ipynb>` - scatter and gather workflow demonstrating how to write client-side executors.
* :github_nvflare_link:`sag_mlflow <examples/hello-world/step-by-step/cifar10/sag_mlflow/sag_mlflow.ipynb>` - MLflow experiment tracking logs with the Client API in scatter & gather workflows.
* :github_nvflare_link:`sag_he <examples/hello-world/step-by-step/cifar10/sag_he/sag_he.ipynb>` - homomorphic encryption using the Client API and POC -he mode.
* :github_nvflare_link:`cse <examples/hello-world/step-by-step/cifar10/cse/cse.ipynb>` - cross-site evaluation using the Client API.
* :github_nvflare_link:`cyclic <examples/hello-world/step-by-step/cifar10/cyclic/cyclic.ipynb>` - cyclic weight transfer workflow with server-side controller.
* :github_nvflare_link:`cyclic_ccwf <examples/hello-world/step-by-step/cifar10/cyclic_ccwf/cyclic_ccwf.ipynb>` - client-controlled cyclic weight transfer workflow with client-side controller.
* :github_nvflare_link:`swarm <examples/hello-world/step-by-step/cifar10/swarm/swarm.ipynb>` - swarm learning and client-side cross-site evaluation with the Client API.

2.2 HIGGS Tabular Data Examples
-------------------------------

* :github_nvflare_link:`tabular_stats <examples/hello-world/step-by-step/higgs/stats/tabular_stats.ipynb>` - federated statistics (histogram) calculation on tabular data.
* :github_nvflare_link:`sklearn_linear <examples/hello-world/step-by-step/higgs/sklearn-linear/sklearn_linear.ipynb>` - federated linear model (logistic regression for binary classification) learning on tabular data.
* :github_nvflare_link:`sklearn_svm <examples/hello-world/step-by-step/higgs/sklearn-svm/sklearn_svm.ipynb>` - federated SVM model learning on tabular data.
* :github_nvflare_link:`sklearn_kmeans <examples/hello-world/step-by-step/higgs/sklearn-kmeans/sklearn_kmeans.ipynb>` - federated k-Means clustering on tabular data.
* :github_nvflare_link:`xgboost <examples/hello-world/step-by-step/higgs/xgboost/xgboost_horizontal.ipynb>` - federated horizontal XGBoost learning on tabular data with bagging collaboration.


3. Tutorial Notebooks
=====================

12 changes: 6 additions & 6 deletions docs/examples/fl_experiment_tracking_mlflow.rst
@@ -53,10 +53,10 @@ Adding MLflow Logging to Configurations

Inside the config folder there are two files, ``config_fed_client.json`` and ``config_fed_server.json``.

.. literalinclude:: ../../examples/advanced/experiment-tracking/mlflow/jobs/hello-pt-mlflow/app/config/config_fed_client.json
:language: json
.. literalinclude:: ../../examples/advanced/experiment-tracking/mlflow/jobs/hello-pt-mlflow/app/config/config_fed_client.conf
:language:
:linenos:
:caption: config_fed_client.json
:caption: config_fed_client.conf

Take a look at the components section of the client config at line 24.
The first component is the ``pt_learner`` which contains the initialization, training, and validation logic.
@@ -69,10 +69,10 @@ within NVFlare with the information to track.
Finally, :class:`ConvertToFedEvent<nvflare.app_common.widgets.convert_to_fed_event.ConvertToFedEvent>` converts local events to federated events.
This changes the event ``analytix_log_stats`` into a fed event ``fed.analytix_log_stats``, which will then be streamed from the clients to the server.

.. literalinclude:: ../../examples/advanced/experiment-tracking/mlflow/jobs/hello-pt-mlflow/app/config/config_fed_server.json
:language: json
.. literalinclude:: ../../examples/advanced/experiment-tracking/mlflow/jobs/hello-pt-mlflow/app/config/config_fed_server.conf
:language:
:linenos:
:caption: config_fed_server.json
:caption: config_fed_server.conf

Under the component section in the server config, we have the
:class:`MLflowReceiver<nvflare.app_opt.tracking.mlflow.mlflow_receiver.MLflowReceiver>`. This component receives
64 changes: 64 additions & 0 deletions docs/fl_introduction.rst
@@ -0,0 +1,64 @@
.. _fl_introduction:

###########################
What is Federated Learning?
###########################

Federated Learning is a distributed learning paradigm where training occurs across multiple clients, each with their own local datasets.
This enables the creation of common robust models without sharing sensitive local data, helping solve issues of data privacy and security.

How does Federated Learning Work?
=================================
The federated learning (FL) server orchestrates the collaboration of multiple clients by first sending an initial model to the FL clients.
The clients perform training on their local datasets, then send the model updates back to the FL server for aggregation to form a global model.
This process forms a single round of federated learning, and after a number of rounds, a robust global model can be developed.

.. image:: resources/fl_diagram.png
:height: 500px
:align: center
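The round described above can be sketched in a few lines of plain Python. This is a toy FedAvg illustration, not NVFlare's implementation; the client names, datasets, and one-parameter "model" are invented:

```python
# Toy FedAvg: the server broadcasts a global model, each client trains
# locally, and the server aggregates updates weighted by dataset size.

def local_train(global_weight, data, lr=0.5):
    # Stand-in for real training: move toward the local data's mean.
    target = sum(data) / len(data)
    return global_weight + lr * (target - global_weight)

clients = {"hospital-a": [2.0, 4.0], "hospital-b": [6.0, 8.0, 10.0]}

global_weight = 0.0
for _ in range(20):  # rounds of federated learning
    updates = {name: local_train(global_weight, d) for name, d in clients.items()}
    total = sum(len(d) for d in clients.values())
    # Weighted aggregation: larger local datasets contribute more.
    global_weight = sum(updates[n] * len(d) / total for n, d in clients.items())
```

With these made-up datasets the global weight converges to the size-weighted average of the local means, which is the behavior weighted FedAvg aggregation is designed to produce.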

FL Terms and Definitions
========================

- FL server: manages job lifecycle, orchestrates workflow, assigns tasks to clients, performs aggregation
- FL client: executes tasks, performs local computation/learning with local dataset, submits result back to FL server
- FL algorithms: FedAvg, FedOpt, FedProx etc. implemented as workflows

.. note::

    Here we describe the centralized version of FL, where the FL server has the role of the aggregator node. However, in a decentralized version such as
    swarm learning, FL clients can serve as the aggregator node instead.

- Types of FL

- horizontal FL: clients hold different data samples over the same features
- vertical FL: clients hold different features over an overlapping set of data samples
  - swarm learning: a decentralized subset of FL where orchestration and aggregation are performed by the clients
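The horizontal/vertical distinction can be made concrete with a toy table; the data, column names, and site names below are all hypothetical:

```python
# A centralized dataset: rows are samples (patients), columns are features.
columns = ["age", "income", "blood_pressure"]
rows = {
    "patient-1": [34, 50_000, 120],
    "patient-2": [58, 72_000, 135],
    "patient-3": [41, 61_000, 128],
}

# Horizontal FL: sites hold different SAMPLES over the SAME features.
site_a = {k: rows[k] for k in ["patient-1"]}
site_b = {k: rows[k] for k in ["patient-2", "patient-3"]}

# Vertical FL: sites hold different FEATURES for overlapping samples.
site_x = {k: v[:2] for k, v in rows.items()}   # holds age, income
site_y = {k: v[2:] for k, v in rows.items()}   # holds blood_pressure
```

Together, the horizontal sites cover all samples, while each vertical site sees every sample but only a slice of its features.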

Main Benefits
=============

Enhanced Data Privacy and Security
----------------------------------
Federated learning facilitates data privacy and data locality by ensuring that the data remains at each site.
Additionally, privacy-preserving techniques such as homomorphic encryption and differential privacy filters can be leveraged to further protect the transferred data.

Improved Accuracy and Diversity
-------------------------------
By training with a variety of data sources across different clients, a robust and generalizable global model can be developed to better represent heterogeneous datasets.

Scalability and Network Efficiency
----------------------------------
With the ability to perform training at the edge, federated learning can scale globally.
Additionally, transferring only the model weights rather than entire datasets makes efficient use of network resources.

Applications
============
An important application of federated learning is in the healthcare sector, where data privacy regulations and patient record confidentiality make training models challenging.
Federated learning can help break down these healthcare data silos to allow hospitals and medical institutions to collaborate and pool their medical knowledge without the need to share their data.
Some common use cases involve classification and detection tasks, drug discovery with federated protein LLMs, and federated analytics on medical devices.

Furthermore, there are many other areas and industries such as financial fraud detection, autonomous vehicles, HPC, mobile applications, etc.
where the ability to use distributed data silos while maintaining data privacy is essential for the development of better models.

Read on to learn how FLARE is built as a flexible federated computing framework to enable federated learning from research to production.
2 changes: 1 addition & 1 deletion docs/flare_overview.rst
@@ -26,7 +26,7 @@ Built for productivity
FLARE is designed for maximum productivity, providing a range of tools to enhance user experience and research efficiency at different stages of the development process:

- **FLARE Client API:** Enables users to transition seamlessly from ML/DL to FL with just a few lines of code changes.
- **Simulator CLI:** Allows users to simulate federated learning or computing jobs in multi-thread settings within a single computer, offering quick response and debugging. The same job can be deployed directly to production.
- **Simulator CLI:** Allows users to simulate federated learning or computing jobs in multi-process settings within a single computer, offering quick response and debugging. The same job can be deployed directly to production.
- **POC CLI:** Facilitates the simulation of federated learning or computing jobs in multi-process settings within one computer. Different processes represent server, clients, and an admin console, providing users with a realistic sense of the federated network. It also allows users to simulate project deployment on a single host.
- **Job CLI:** Permits users to create and submit jobs directly in POC or production environments.
- **FLARE API:** Enables users to run jobs directly from Python code or notebooks.
20 changes: 16 additions & 4 deletions docs/getting_started.rst
@@ -22,14 +22,14 @@ Clone NVFLARE repo to get examples, switch main branch (latest stable branch)
$ git clone https://github.com/NVIDIA/NVFlare.git
$ cd NVFlare
$ git switch main
$ git switch 2.4
Note on branches:

* The `main <https://github.com/NVIDIA/NVFlare/tree/main>`_ branch is the default (unstable) development branch

* The 2.0, 2.1, 2.2, and 2.3 etc. branches are the branches for each major release and minor patches
* The 2.1, 2.2, 2.3, and 2.4 etc. branches are the branches for each major release and minor patches


Quick Start with Simulator
@@ -63,6 +63,14 @@ establishing a secure, distributed FL workflow.
Installation
=============

.. note::
    The server and client versions of nvflare must match; we do not support cross-version compatibility.

Supported Operating Systems
---------------------------
- Linux
- OSX (Note: some optional dependencies are not compatible, such as tenseal and openmined.psi)

Python Version
--------------

@@ -117,7 +125,6 @@ You may find that the pip and setuptools versions in the venv need updating:
(nvflare-env) $ python3 -m pip install -U pip
(nvflare-env) $ python3 -m pip install -U setuptools
Install Stable Release
----------------------

@@ -127,6 +134,11 @@ Stable releases are available on `NVIDIA FLARE PyPI <https://pypi.org/project/nv
$ python3 -m pip install nvflare
.. note::

    In addition to the dependencies included when installing nvflare, many of our example applications have additional packages that must be installed.
    Make sure to install from any requirements.txt files before running the examples.
    See :github_nvflare_link:`nvflare/app_opt <nvflare/app_opt>` for modules and components with optional dependencies.

.. _containerized_deployment:

@@ -210,7 +222,7 @@ Production mode is secure with TLS certificates - depending the choice the deplo

- HA or non-HA
- Local or remote
- On-premise or on cloud
- On-premise or on cloud (See :ref:`cloud_deployment`)

Using non-HA, secure, local mode (all clients and server running on the same host), production mode is very similar to POC mode except it is secure.

31 changes: 24 additions & 7 deletions docs/index.rst
@@ -5,23 +5,37 @@ NVIDIA FLARE
.. toctree::
:maxdepth: -1
:hidden:
:caption: Introduction

fl_introduction
flare_overview
whats_new
getting_started

.. toctree::
:maxdepth: -1
:hidden:
:caption: Guides

example_applications_algorithms
real_world_fl
user_guide
programming_guide
best_practices

.. toctree::
:maxdepth: -1
:hidden:
:caption: Miscellaneous

faq
publications_and_talks
contributing
API <apidocs/modules>
glossary

NVIDIA FLARE (NVIDIA Federated Learning Application Runtime Environment) is a domain-agnostic, open-source, extensible SDK that allows
researchers and data scientists to adaptexisting ML/DL workflows (PyTorch, RAPIDS, Nemo, TensorFlow) to a federated paradigm; and enables
researchers and data scientists to adapt existing ML/DL workflows (PyTorch, RAPIDS, Nemo, TensorFlow) to a federated paradigm; and enables
platform developers to build a secure, privacy preserving offering for a distributed multi-party collaboration.

NVIDIA FLARE is built on a componentized architecture that gives you the flexibility to take federated learning workloads from research
@@ -34,18 +48,21 @@ and simulation to real-world production deployment. Some of the key components
- **Management tools** for secure provisioning and deployment, orchestration, and management
- **Specification-based API** for extensibility

Learn more in the :ref:`FLARE Overview <flare_overview>`, :ref:`Key Features <key_features>`, :ref:`What's New <whats_new>`, and the
:ref:`User Guide <user_guide>` and :ref:`Programming Guide <programming_guide>`.
Learn more about FLARE features in the :ref:`FLARE Overview <flare_overview>` and :ref:`What's New <whats_new>`.

Getting Started
===============
For first-time users and FL researchers, FLARE provides the :ref:`fl_simulator` that allows you to build, test, and deploy applications locally.
The :ref:`Getting Started guide <getting_started>` covers installation and walks through an example application using the FL Simulator.
For first-time users and FL researchers, FLARE provides the :ref:`FL Simulator <fl_simulator>` that allows you to build, test, and deploy applications locally.
The :ref:`Getting Started <getting_started>` guide covers installation and walks through an example application using the FL Simulator.
Additional examples can be found in the :ref:`Example Applications <example_applications_algorithms>`, which showcase different federated learning workflows and algorithms on various machine learning and deep learning tasks.

FLARE for Users
===============
If you want to learn how to interact with the FLARE system, please refer to the :ref:`User Guide <user_guide>`.
When you are ready for a secure, distributed deployment, the :ref:`Real World Federated Learning <real_world_fl>` section covers the tools and process
required to deploy and operate a secure, real-world FLARE project.

FLARE for Developers
====================
When you're ready to build your own application, the :ref:`Programming Best Practices <best_practices>`, :ref:`FAQ<faq>`, and
:ref:`Programming Guide <programming_guide>` give an in depth look at the FLARE platform and APIs.
When you're ready to build your own application, the :ref:`Programming Guide <programming_guide>`, :ref:`Programming Best Practices <best_practices>`, :ref:`FAQ <faq>`, and :ref:`API Reference <apidocs/modules>`
give an in-depth look at the FLARE platform and APIs.