To run the project using a Docker container, execute the following command:
# Start the Docker container
docker run -it --rm \
--shm-size=16g \
--ipc=host \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
-v ./docker_config/NVFlare:/workspace/nvflare \
--gpus=all \
-v ./:/workspace \
jefftud/nvflare-pt-dev:cifar10
--shm-size=16g: Allocates shared memory.
--ipc=host: Shares IPC namespace with the host.
--ulimit memlock=-1: Removes memory locking limits.
--ulimit stack=67108864: Increases the stack size limit.
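As a rough cross-check of the stack value above, the arithmetic below converts 67108864 bytes into KiB and MiB. It assumes that Docker interprets the stack ulimit in bytes and that the shell's ulimit -s inside the container reports KiB; both are assumptions worth verifying on your own setup.

```shell
# Stack limit passed to docker run via --ulimit stack=67108864
stack_bytes=67108864

# Convert to KiB and MiB (1 KiB = 1024 bytes, 1 MiB = 1024 KiB)
stack_kib=$((stack_bytes / 1024))
stack_mib=$((stack_kib / 1024))

echo "stack limit: ${stack_kib} KiB (${stack_mib} MiB)"

# Inside the running container, these commands can be used to inspect
# the effective settings (output depends on your environment):
#   ulimit -s        # stack size limit
#   ulimit -l        # locked-memory limit
#   df -h /dev/shm   # shared memory size
```

If the value printed by ulimit -s in the container differs from the KiB figure computed here, the flag was probably not applied as expected.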
To prepare the data splits for CIFAR-10, run the following script:
./application/jobs/cifar10/prepare_data.sh
The Federated Learning (FL) Simulator is a lightweight tool that uses threads to simulate multiple clients. It’s useful for local testing and debugging. To run the simulator, use the following command:
nvflare simulator ./application/jobs/cifar10 -w /tmp/nvflare/cifar10 -n 2 -t 2
-w /tmp/nvflare/cifar10: Specifies the workspace directory.
-n 2: Sets the number of clients.
-t 2: Defines the number of threads.
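To make it easier to vary these settings, the simulator command can be assembled from variables. The sketch below only builds and prints the command as a dry run; the variable names are illustrative and not part of NVFlare.

```shell
# Illustrative variable names; the values match the command above
job_dir=./application/jobs/cifar10
workspace=/tmp/nvflare/cifar10
n_clients=2
n_threads=2

# Assemble the command line from the variables (dry run: print only)
sim_cmd="nvflare simulator $job_dir -w $workspace -n $n_clients -t $n_threads"
echo "$sim_cmd"

# To actually start the simulator, run the printed command in the container.
```

Changing n_clients and n_threads together keeps one thread per simulated client, as in the original command.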
For detailed instructions on how to use the simulator, refer to the official NVFLARE Quick Start with Simulator Guide.
To run this application code in NVFlare proof-of-concept (POC) mode, the dataset first needs to be downloaded (see above).
Then, the data subsets need to be prepared. You can reuse the ones created by running the simulator as above by renaming them (TODO: how can they be created without running the full simulation?):
mv /tmp/cifar10_splits/site-1.npy /tmp/cifar10_splits/poc_client_0.npy
mv /tmp/cifar10_splits/site-2.npy /tmp/cifar10_splits/poc_client_1.npy
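For more than two clients, the renaming can be written as a loop. This is a minimal sketch that assumes the simulator splits follow the site-N.npy naming shown above and that POC clients are numbered from 0; the helper name rename_splits is hypothetical. The demonstration runs on a scratch directory with empty placeholder files rather than on the real splits.

```shell
# Rename site-1..site-N to poc_client_0..poc_client_(N-1) in a directory.
# rename_splits is a hypothetical helper, not part of NVFlare.
rename_splits() {
    split_dir=$1
    n_clients=$2
    i=1
    while [ "$i" -le "$n_clients" ]; do
        src="$split_dir/site-$i.npy"
        dst="$split_dir/poc_client_$((i - 1)).npy"
        if [ -f "$src" ]; then
            mv "$src" "$dst"
        else
            echo "missing split: $src" >&2
            return 1
        fi
        i=$((i + 1))
    done
}

# Demonstration on a scratch directory with empty placeholder files
demo_dir=$(mktemp -d)
touch "$demo_dir/site-1.npy" "$demo_dir/site-2.npy"
rename_splits "$demo_dir" 2
ls "$demo_dir"
```

For the real data, the equivalent call would be rename_splits /tmp/cifar10_splits 2.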
Next, further preparations are needed:
nvflare poc prepare -c poc_client_0 poc_client_1
nvflare poc prepare-jobs-dir -j application/jobs/
Now proceed interactively by running
nvflare poc start
and waiting until you are prompted for further steps, or proceed non-interactively by running
nvflare poc start -ex [email protected]
sleep 15
nvflare job submit -j application/jobs/cifar10
Here, the sleep is needed to wait for the services to finish starting in the background before the job is submitted.
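A fixed wait can be too short on a slow machine. One possible alternative is a small retry loop around the submit command. The sketch below assumes that nvflare job submit exits with a non-zero status while the server is not yet reachable, which has not been verified here; the retry helper is hypothetical and not part of NVFlare.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Wait before the next attempt, but not after the last one
        if [ "$attempt" -lt "$max_attempts" ]; then
            echo "attempt $attempt/$max_attempts failed; retrying in ${delay}s" >&2
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Intended use (assumes a non-zero exit status while the server is starting):
#   retry 10 5 nvflare job submit -j application/jobs/cifar10
```

The attempt count and delay in the usage line are arbitrary starting points, not recommended values.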
Finally, you can clean up by running
nvflare poc stop
nvflare poc clean
or just exit the container.