Quickstart PyTorch Lightning

In this federated learning tutorial, we will learn how to train an AutoEncoder model on MNIST using Flower and PyTorch Lightning. It is recommended to create a virtual environment and run everything within it.
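For example, a virtual environment can be created with Python's built-in venv module (the directory name .venv is just a convention):

```shell
# Create a virtual environment in the project directory
python3 -m venv .venv

# Activate it (on Windows use: .venv\Scripts\activate)
source .venv/bin/activate

# Confirm the interpreter now comes from the virtual environment
which python
```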

Then, clone the code example directly from GitHub:

git clone --depth=1 https://github.com/adap/flower.git _tmp \
             && mv _tmp/examples/quickstart-pytorch-lightning . \
             && rm -rf _tmp && cd quickstart-pytorch-lightning

This will create a new directory called quickstart-pytorch-lightning containing the following files:

quickstart-pytorch-lightning
├── pytorchlightning_example
│   ├── client_app.py   # Defines your ClientApp
│   ├── server_app.py   # Defines your ServerApp
│   └── task.py         # Defines your model, training and data loading
├── pyproject.toml      # Project metadata like dependencies and configs
└── README.md

Next, activate your environment, then run:

# Navigate to the example directory
$ cd path/to/quickstart-pytorch-lightning

# Install project and dependencies
$ pip install -e .

By default, this project uses a local simulation profile that flwr run submits to a managed local SuperLink, which then executes the run with the Flower Simulation Runtime. It creates a federation of 4 nodes using FedAvg as the aggregation strategy. The dataset will be partitioned using Flower Datasets' IidPartitioner. To run the project, do:
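Conceptually, IID partitioning shuffles the sample indices and splits them into equally sized shards, one per node. The sketch below illustrates the idea in pure Python; it is not Flower Datasets' actual IidPartitioner implementation, and the function name is hypothetical:

```python
import random


def iid_partition(num_samples: int, num_partitions: int, seed: int = 42) -> list[list[int]]:
    """Shuffle sample indices and split them into near-equal IID shards.

    Illustrative sketch only -- not the actual IidPartitioner from
    Flower Datasets.
    """
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    # Distribute shuffled indices across partitions via strided slicing
    return [indices[i::num_partitions] for i in range(num_partitions)]


# MNIST has 60,000 training samples; split across 4 simulated nodes
partitions = iid_partition(num_samples=60_000, num_partitions=4)
print([len(p) for p in partitions])  # [15000, 15000, 15000, 15000]
```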

# Run with default arguments and stream logs
$ flwr run . --stream

Plain flwr run . submits the run, prints the run ID, and returns without streaming logs. For the full local workflow, see Run Flower Locally with a Managed SuperLink.

With default arguments you will see streamed output like this:

Successfully built flwrlabs.quickstart-pytorch-lightning.1-0-0.014c8eb3.fab
Starting local SuperLink on 127.0.0.1:39093...
Successfully started run 1859953118041441032
INFO :      Starting FedAvg strategy:
INFO :          ├── Number of rounds: 3
INFO :      [ROUND 1/3]
INFO :      configure_train: Sampled 2 nodes (out of 4)
INFO :      aggregate_train: Received 2 results and 0 failures
INFO :          └──> Aggregated MetricRecord: {'train_loss': 0.0487}
INFO :      configure_evaluate: Sampled 2 nodes (out of 4)
INFO :      aggregate_evaluate: Received 2 results and 0 failures
INFO :          └──> Aggregated MetricRecord: {'eval_loss': 0.0495}
INFO :      [ROUND 2/3]
INFO :      ...
INFO :      [ROUND 3/3]
INFO :      ...
INFO :      Strategy execution finished in 159.24s
INFO :      Final results:
INFO :          ServerApp-side Evaluate Metrics:
INFO :          {}
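The aggregated MetricRecord values in the log are produced by FedAvg-style aggregation, where each node's metric is typically weighted by its number of local examples. A minimal sketch of that idea (the function and the sample inputs are illustrative, not taken from this run):

```python
def weighted_average(metrics: list[tuple[int, float]]) -> float:
    """Average a metric across nodes, weighted by local example counts.

    Illustrative sketch of FedAvg-style metric aggregation; `metrics`
    holds (num_examples, metric_value) pairs, one per sampled node.
    """
    total_examples = sum(n for n, _ in metrics)
    return sum(n * m for n, m in metrics) / total_examples


# Two hypothetical nodes reporting (num_examples, train_loss)
print(round(weighted_average([(15_000, 0.05), (15_000, 0.0474)]), 4))  # 0.0487
```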

Each simulated ClientApp (two per round) will also log a summary of their local training process. Expect this output to be similar to:

# The left part indicates the process ID running the `ClientApp`
(ClientAppActor pid=38155) ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
(ClientAppActor pid=38155) ┃        Test metric        ┃       DataLoader 0        ┃
(ClientAppActor pid=38155) ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
(ClientAppActor pid=38155) │         test_loss         │   0.045175597071647644    │
(ClientAppActor pid=38155) └───────────────────────────┴───────────────────────────┘

You can also override the parameters defined in the [tool.flwr.app.config] section in pyproject.toml like this:

# Override some arguments
$ flwr run . --run-config num-server-rounds=5
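Conceptually, each --run-config entry is a key=value pair that replaces the matching default from [tool.flwr.app.config]. The sketch below illustrates that merge in plain Python; it is not Flower's actual implementation, and the function name is hypothetical:

```python
def apply_overrides(defaults: dict[str, object], overrides: list[str]) -> dict[str, object]:
    """Merge key=value override strings onto a default config dict.

    Illustrative sketch only -- not how flwr actually parses run configs.
    """
    config = dict(defaults)
    for item in overrides:
        key, _, raw = item.partition("=")
        # Coerce integer-looking values; keep everything else as a string
        config[key] = int(raw) if raw.lstrip("-").isdigit() else raw
    return config


defaults = {"num-server-rounds": 3}
print(apply_overrides(defaults, ["num-server-rounds=5"]))  # {'num-server-rounds': 5}
```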

Tip

Check the Run simulations documentation to learn more about how to configure and run Flower simulations.

Note

Check the source code of this tutorial in examples/quickstart-pytorch-lightning in the Flower GitHub repository.