Quickstart PyTorch Lightning

In this federated learning tutorial we will learn how to train an AutoEncoder model on MNIST using Flower and PyTorch Lightning. We recommend creating a virtual environment and running everything within it.
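For example, a virtual environment can be created and activated like this (the directory name `.venv` is just a convention):

```shell
# Create a virtual environment in the current directory
python3 -m venv .venv

# Activate it (on Linux/macOS; on Windows use .venv\Scripts\activate)
source .venv/bin/activate
```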

Then, clone the code example directly from GitHub:

git clone --depth=1 https://github.com/adap/flower.git _tmp \
             && mv _tmp/examples/quickstart-pytorch-lightning . \
             && rm -rf _tmp && cd quickstart-pytorch-lightning

This will create a new directory called quickstart-pytorch-lightning containing the following files:

quickstart-pytorch-lightning
├── pytorchlightning_example
│   ├── client_app.py   # Defines your ClientApp
│   ├── server_app.py   # Defines your ServerApp
│   └── task.py         # Defines your model, training and data loading
├── pyproject.toml      # Project metadata like dependencies and configs
└── README.md

Next, activate your environment, then run:

# Navigate to the example directory
$ cd path/to/quickstart-pytorch-lightning

# Install project and dependencies
$ pip install -e .

By default, the Flower Simulation Engine starts and creates a federation of 4 nodes using FedAvg as the aggregation strategy. The dataset is partitioned using Flower Datasets' IidPartitioner. To run the project, do:

# Run with default arguments
$ flwr run .
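The `IidPartitioner` splits the dataset into equally sized partitions drawn uniformly at random. The following stand-alone Python sketch illustrates the idea (it is not the Flower Datasets implementation):

```python
import random

def iid_partition(num_examples: int, num_partitions: int, seed: int = 42):
    """Shuffle example indices and split them into equal IID partitions."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    size = num_examples // num_partitions
    return [indices[i * size:(i + 1) * size] for i in range(num_partitions)]

# MNIST has 60,000 training examples; split them across 4 nodes
partitions = iid_partition(num_examples=60000, num_partitions=4)
print(len(partitions), len(partitions[0]))  # 4 15000
```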

With default arguments you will see an output like this one:

Loading project configuration...
Success
INFO :      Starting FedAvg strategy:
INFO :          ├── Number of rounds: 3
INFO :          ├── ArrayRecord (0.39 MB)
INFO :          ├── ConfigRecord (train): (empty!)
INFO :          ├── ConfigRecord (evaluate): (empty!)
INFO :          ├──> Sampling:
INFO :          │       ├── Fraction: train (0.50) | evaluate (0.50)
INFO :          │       ├── Minimum nodes: train (2) | evaluate (2)
INFO :          │       └── Minimum available nodes: 2
INFO :          └──> Keys in records:
INFO :                  ├── Weighted by: 'num-examples'
INFO :                  ├── ArrayRecord key: 'arrays'
INFO :                  └── ConfigRecord key: 'config'
INFO :
INFO :
INFO :      [ROUND 1/3]
INFO :      configure_train: Sampled 2 nodes (out of 4)
INFO :      aggregate_train: Received 2 results and 0 failures
INFO :          └──> Aggregated MetricRecord: {'train_loss': 0.0487}
INFO :      configure_evaluate: Sampled 2 nodes (out of 4)
INFO :      aggregate_evaluate: Received 2 results and 0 failures
INFO :          └──> Aggregated MetricRecord: {'eval_loss': 0.0495}
INFO :
INFO :      [ROUND 2/3]
INFO :      configure_train: Sampled 2 nodes (out of 4)
INFO :      aggregate_train: Received 2 results and 0 failures
INFO :          └──> Aggregated MetricRecord: {'train_loss': 0.0420}
INFO :      configure_evaluate: Sampled 2 nodes (out of 4)
INFO :      aggregate_evaluate: Received 2 results and 0 failures
INFO :          └──> Aggregated MetricRecord: {'eval_loss': 0.0455}
INFO :
INFO :      [ROUND 3/3]
INFO :      configure_train: Sampled 2 nodes (out of 4)
INFO :      aggregate_train: Received 2 results and 0 failures
INFO :          └──> Aggregated MetricRecord: {'train_loss': 0.05082}
INFO :      configure_evaluate: Sampled 2 nodes (out of 4)
INFO :      aggregate_evaluate: Received 2 results and 0 failures
INFO :          └──> Aggregated MetricRecord: {'eval_loss': 0.0441}
INFO :
INFO :      Strategy execution finished in 159.24s
INFO :
INFO :      Final results:
INFO :
INFO :          Global Arrays:
INFO :                  ArrayRecord (0.389 MB)
INFO :
INFO :          Aggregated ClientApp-side Train Metrics:
INFO :          { 1: {'train_loss': '4.8696e-02'},
INFO :            2: {'train_loss': '4.1957e-02'},
INFO :            3: {'train_loss': '5.0818e-02'}}
INFO :
INFO :          Aggregated ClientApp-side Evaluate Metrics:
INFO :          { 1: {'eval_loss': '4.9516e-02'},
INFO :            2: {'eval_loss': '4.5510e-02'},
INFO :            3: {'eval_loss': '4.4052e-02'}}
INFO :
INFO :          ServerApp-side Evaluate Metrics:
INFO :          {}
INFO :
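
The strategy log above notes that aggregation is weighted by 'num-examples'. A minimal stand-alone sketch of such weighted averaging (not the actual FedAvg implementation):

```python
def weighted_average(weights_list, num_examples_list):
    """Average per-client parameter vectors, weighted by dataset size."""
    total = sum(num_examples_list)
    dim = len(weights_list[0])
    return [
        sum(w[i] * n for w, n in zip(weights_list, num_examples_list)) / total
        for i in range(dim)
    ]

# Two clients with parameter vectors of length 2; the second client
# holds three times as much data, so its parameters count three times as much
print(weighted_average([[1.0, 2.0], [3.0, 4.0]], [10, 30]))  # [2.5, 3.5]
```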

Each simulated ClientApp (two per round) also logs a summary of its local training and evaluation. Expect output similar to this:

# The left part indicates the process ID running the `ClientApp`
(ClientAppActor pid=38155) ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
(ClientAppActor pid=38155) ┃        Test metric        ┃       DataLoader 0        ┃
(ClientAppActor pid=38155) ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
(ClientAppActor pid=38155) │         test_loss         │   0.045175597071647644    │
(ClientAppActor pid=38155) └───────────────────────────┴───────────────────────────┘

You can also override the parameters defined in the [tool.flwr.app.config] section in pyproject.toml like this:

# Override some arguments
$ flwr run . --run-config num-server-rounds=5
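Run-config values reach the app as a flat mapping from string keys to values. A hypothetical sketch of reading such an override with a fallback default (the helper name is ours, not part of the Flower API):

```python
def get_num_rounds(run_config: dict, default: int = 3) -> int:
    """Read 'num-server-rounds' from a run-config mapping, with a default."""
    return int(run_config.get("num-server-rounds", default))

print(get_num_rounds({}))                        # 3 (default)
print(get_num_rounds({"num-server-rounds": 5}))  # 5 (overridden via --run-config)
```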

Tip

Check the Run simulations documentation to learn more about how to configure and run Flower simulations.

Note

Check the source code of this tutorial in examples/quickstart-pytorch-lightning in the Flower GitHub repository.