# SPRIN-D: Image Classification

Quickstart: create a local copy of this app with:

```shell
flwr new @flwrlabs/sprind-vision
```
This Flower App allows you to federate the training of ResNet models on the ImageNet dataset for image classification using PyTorch. The ClientApp running on a SuperNode will stream a subset of the dataset directly from Hugging Face. Aggregated metrics obtained during training and evaluation are logged to your Weights & Biases account if you configure it to do so.
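For reference, the snippet below is a minimal sketch of how such streaming can look with the `datasets` library. The repository id, subset size, and seed are illustrative assumptions; the app's actual logic lives in `dataset.py`.

```python
# Hedged sketch of streaming a dataset subset from Hugging Face.
# The repo id, seed, and subset size are illustrative; see dataset.py
# for the app's actual preprocessing and streaming logic.
from datasets import load_dataset

# streaming=True avoids downloading the full archive; samples are
# fetched on the fly as the ClientApp iterates over them.
stream = load_dataset("imagenet-1k", split="train", streaming=True)  # gated repo; requires accepting the license

# Each SuperNode consumes only a shuffled subset of the stream.
subset = stream.shuffle(seed=42, buffer_size=1_000).take(1_024)

for example in subset:
    image, label = example["image"], example["label"]
    # ...apply transforms and feed batches to the ResNet model (see task.py)
```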
The contents of this Flower App are as follows:
```
sprind-vision
├── image_classification
│   ├── __init__.py
│   ├── client_app.py   # Defines your ClientApp
│   ├── server_app.py   # Defines your ServerApp
│   ├── strategy.py     # Defines a custom strategy for easy logging to W&B
│   ├── dataset.py      # Defines dataset preprocessing and streaming functionality
│   └── task.py         # Defines your model, training and data loading
├── pyproject.toml      # Project metadata like dependencies and configs
└── README.md
```
## Running the App
> [!NOTE]
> This section assumes you have already deployed a Flower federation with at least two SuperNodes. Please refer to the provided instructions on how to connect SuperNodes to a running SuperLink.
Before running the app, you need to configure it to point to the SuperLink. This is an easy process and only requires you to edit one line in the `pyproject.toml` in this directory: the `address` field at the bottom of the file.
```toml
[tool.flwr.federations.sprind-federation]
address = "SUPERLINK-CONTROL-ADDRESS"  # <--- Replace with the provided SuperLink IP:PORT
```
To run the app with default settings, execute one of the following commands from the directory where this `README.md` lives:
```shell
# If you know your Weights & Biases token
flwr run . --run-config="wandb-token='<YOUR-WANDB-TOKEN>'" --stream

# If you don't have one
flwr run . --stream
```
## Expected Output
On the terminal where you execute `flwr run`, you'll see an output similar to the one below. Note this output was obtained when running with Weights & Biases (hence the first few log lines with the `wandb` prefix) and in a federation of 5 SuperNodes. By default, each round the ServerApp samples half of the connected SuperNodes for training; then, all connected SuperNodes take part in a round of federated evaluation. By default, the app runs for three rounds using a ResNet-50 model.
```
Loading project configuration...
Success
🎊 Successfully started run 7522963691491767233
INFO :      Starting logstream for run_id `7522963691491767233`
INFO :      Start `flwr-serverapp` process
wandb: Currently logged in as: YOUR-USERNAME to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.23.0
wandb: Run data is saved locally in <YOUR-LOCAL-FS>/wandb/run-20251125_174027-fnr1s6fq
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run 7522963691491767233-ServerApp
wandb: ⭐️ View project at https://wandb.ai/YOUR-USERNAME/sprind-vision
wandb: 🚀 View run at https://wandb.ai/YOUR-USERNAME/sprind-vision/runs/fnr1s6fq
INFO :      Starting FedAvgWithWandB strategy:
INFO :      ├── Number of rounds: 10
INFO :      ├── ArrayRecord (97.73 MB)
INFO :      ├── ConfigRecord (train): {'lr': 0.01}
INFO :      ├── ConfigRecord (evaluate): (empty!)
INFO :      ├──> Sampling:
INFO :      │    ├──Fraction: train (0.50) | evaluate ( 1.00)
INFO :      │    ├──Minimum nodes: train (2) | evaluate (2)
INFO :      │    └──Minimum available nodes: 2
INFO :      └──> Keys in records:
INFO :           ├── Weighted by: 'num-examples'
INFO :           ├── ArrayRecord key: 'arrays'
INFO :           └── ConfigRecord key: 'config'
INFO :
INFO :
INFO :      [ROUND 1/3]
INFO :      configure_train: Sampled 3 nodes (out of 5)
INFO :      aggregate_train: Received 3 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'train_loss': 19.972134065628055}
INFO :      configure_evaluate: Sampled 5 nodes (out of 5)
INFO :      aggregate_evaluate: Received 5 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'eval_loss': 24157916.1, 'eval_acc': 0.0}
INFO :
INFO :      [ROUND 2/3]
INFO :      configure_train: Sampled 3 nodes (out of 5)
INFO :      aggregate_train: Received 3 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'train_loss': 14.02075376510621}
INFO :      configure_evaluate: Sampled 5 nodes (out of 5)
INFO :      aggregate_evaluate: Received 5 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'eval_loss': 14521418.0, 'eval_acc': 0.0}
INFO :
INFO :      [ROUND 3/3]
INFO :      configure_train: Sampled 3 nodes (out of 5)
INFO :      aggregate_train: Received 3 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'train_loss': 10.710087394714355}
INFO :      configure_evaluate: Sampled 5 nodes (out of 5)
INFO :      aggregate_evaluate: Received 5 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'eval_loss': 5864870.0, 'eval_acc': 0.0}
INFO :
INFO :      Strategy execution finished in 41.42s
INFO :
INFO :      Final results:
INFO :
INFO :      Global Arrays:
INFO :          ArrayRecord (97.734 MB)
INFO :
INFO :      Aggregated ClientApp-side Train Metrics:
INFO :      { 1: {'train_loss': '1.9972e+01'},
INFO :        2: {'train_loss': '1.4021e+01'},
INFO :        3: {'train_loss': '1.0074e+01'}}
INFO :
INFO :      Aggregated ClientApp-side Evaluate Metrics:
INFO :      { 1: {'eval_acc': '0.0000e+00', 'eval_loss': '2.4158e+07'},
INFO :        2: {'eval_acc': '0.0000e+00', 'eval_loss': '1.4521e+07'},
INFO :        3: {'eval_acc': '0.0000e+00', 'eval_loss': '5.5865e+06'}}
INFO :
INFO :      ServerApp-side Evaluate Metrics:
INFO :      {}
```
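As the strategy banner indicates (`Weighted by: 'num-examples'`), the aggregated metrics above are weighted averages over the replies returned by the sampled ClientApps. Below is a minimal sketch of that computation in plain Python; the reply values are made up, and the real aggregation is handled by the strategy in `strategy.py`.

```python
# Hedged sketch: a 'num-examples'-weighted average of client metrics,
# in the spirit of what the strategy computes. Values below are made up.
replies = [
    {"train_loss": 21.0, "num-examples": 500},
    {"train_loss": 18.5, "num-examples": 1500},
    {"train_loss": 20.0, "num-examples": 1000},
]

# Clients holding more examples contribute proportionally more.
total_examples = sum(r["num-examples"] for r in replies)
train_loss = sum(r["train_loss"] * r["num-examples"] for r in replies) / total_examples
print({"train_loss": train_loss})  # {'train_loss': 19.416666666666668}
```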
## Override Run Config
You can also override the settings for your ClientApp and ServerApp defined in the `[tool.flwr.app.config]` section of the `pyproject.toml`. This can be done by extending the list of arguments passed via the `--run-config` argument to `flwr run`. For example:
```shell
# Run for 5 rounds
flwr run . --run-config="wandb-token='<YOUR-WANDB-TOKEN>' num-server-rounds=5" --stream

# Run for 5 rounds using the ResNet-101 model
flwr run . --run-config="wandb-token='<YOUR-WANDB-TOKEN>' num-server-rounds=5 model-name=resnet101" --stream
```
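Inside the app, these values are read from the run config via the `Context` object. The skeleton below is an illustrative sketch, assuming the Message-API `ServerApp` of recent Flower releases; the app's real entry point lives in `server_app.py`.

```python
# Hedged sketch: reading run-config overrides inside a ServerApp.
# Key names match [tool.flwr.app.config]; the skeleton is illustrative,
# see server_app.py for the app's actual entry point.
from flwr.common import Context
from flwr.serverapp import Grid, ServerApp

app = ServerApp()

@app.main()
def main(grid: Grid, context: Context) -> None:
    num_rounds = context.run_config["num-server-rounds"]  # e.g. 5 when overridden
    model_name = context.run_config["model-name"]         # e.g. "resnet101"
    # ...build the model named by `model_name` and run `num_rounds` rounds
```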