# SPRIN-D: Image Classification

Quickstart: create a local copy of this app with:

```shell
flwr new @flwrlabs/sprind-vision
```
This Flower App allows you to federate the training of ResNet models on the ImageNet dataset for image classification using PyTorch. The ClientApp running on a SuperNode will stream a subset of the dataset directly from Hugging Face. Aggregated metrics obtained during training and evaluation are logged to your Weights & Biases account if you configure it to do so.
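For reference, the snippet below is a minimal sketch of how such streaming can look with the `datasets` library. The repository id, subset size, and seed are illustrative assumptions; the app's actual logic lives in `dataset.py`.

```python
# Hedged sketch of streaming a dataset subset from Hugging Face.
# The repo id, seed, and subset size are illustrative; see dataset.py
# for the app's actual preprocessing and streaming logic.
from datasets import load_dataset

# streaming=True avoids downloading the full archive; samples are
# fetched on the fly as the ClientApp iterates over them.
stream = load_dataset("imagenet-1k", split="train", streaming=True)  # gated repo; requires accepting the license

# Each SuperNode consumes only a shuffled subset of the stream.
subset = stream.shuffle(seed=42, buffer_size=1_000).take(1_024)

for example in subset:
    image, label = example["image"], example["label"]
    # ...apply transforms and feed batches to the ResNet model (see task.py)
```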
The contents of this Flower App are as follows:
```
sprind-vision
├── image_classification
│   ├── __init__.py
│   ├── client_app.py   # Defines your ClientApp
│   ├── server_app.py   # Defines your ServerApp
│   ├── strategy.py     # Defines a custom strategy for easy logging to W&B
│   ├── dataset.py      # Defines dataset preprocessing and streaming functionality
│   └── task.py         # Defines your model, training and data loading
├── pyproject.toml      # Project metadata like dependencies and configs
└── README.md
```
## Running the App
> [!NOTE]
> This section assumes you have already deployed a Flower federation with at least two SuperNodes. Please refer to the provided instructions on how to connect SuperNodes to a running SuperLink.
Before running the app, you need to configure it to point to the SuperLink. This is an easy process and only requires you to edit one line in the `pyproject.toml` in this directory: the `address` field at the bottom of the file.
```toml
[tool.flwr.federations.sprind-federation]
address = "SUPERLINK-CONTROL-ADDRESS"  # <--- Replace with the provided SuperLink IP:PORT
```
To run the app with default settings, execute one of the following commands from the directory where this `README.md` lives:
```shell
# If you know your Weights & Biases token
flwr run . --run-config="wandb-token='<YOUR-WANDB-TOKEN>'" --stream

# If you don't have one
flwr run . --stream
```
## Expected Output
On the terminal where you execute `flwr run`, you'll see an output similar to the one below. Note this output was obtained when running with Weights & Biases (hence the first few log lines with the `wandb` prefix) and in a federation of 5 SuperNodes. By default, each round the ServerApp samples half of the connected SuperNodes for training; then, all connected SuperNodes take part in a round of federated evaluation. By default, the app runs for three rounds using a ResNet-50 model.
```
Loading project configuration...
Success
🎊 Successfully started run 7522963691491767233
INFO :      Starting logstream for run_id `7522963691491767233`
INFO :      Start `flwr-serverapp` process
wandb: Currently logged in as: YOUR-USERNAME to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.23.0
wandb: Run data is saved locally in <YOUR-LOCAL-FS>/wandb/run-20251125_174027-fnr1s6fq
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run 7522963691491767233-ServerApp
wandb: ⭐️ View project at https://wandb.ai/YOUR-USERNAME/sprind-vision
wandb: 🚀 View run at https://wandb.ai/YOUR-USERNAME/sprind-vision/runs/fnr1s6fq
INFO :      Starting FedAvgWithWandB strategy:
INFO :      ├── Number of rounds: 10
INFO :      ├── ArrayRecord (97.73 MB)
INFO :      ├── ConfigRecord (train): {'lr': 0.01}
INFO :      ├── ConfigRecord (evaluate): (empty!)
INFO :      ├──> Sampling:
INFO :      │    ├──Fraction: train (0.50) | evaluate ( 1.00)
INFO :      │    ├──Minimum nodes: train (2) | evaluate (2)
INFO :      │    └──Minimum available nodes: 2
INFO :      └──> Keys in records:
INFO :           ├── Weighted by: 'num-examples'
INFO :           ├── ArrayRecord key: 'arrays'
INFO :           └── ConfigRecord key: 'config'
INFO :
INFO :
INFO :      [ROUND 1/3]
INFO :      configure_train: Sampled 3 nodes (out of 5)
INFO :      aggregate_train: Received 3 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'train_loss': 19.972134065628055}
INFO :      configure_evaluate: Sampled 5 nodes (out of 5)
INFO :      aggregate_evaluate: Received 5 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'eval_loss': 24157916.1, 'eval_acc': 0.0}
INFO :
INFO :      [ROUND 2/3]
INFO :      configure_train: Sampled 3 nodes (out of 5)
INFO :      aggregate_train: Received 3 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'train_loss': 14.02075376510621}
INFO :      configure_evaluate: Sampled 5 nodes (out of 5)
INFO :      aggregate_evaluate: Received 5 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'eval_loss': 14521418.0, 'eval_acc': 0.0}
INFO :
INFO :      [ROUND 3/3]
INFO :      configure_train: Sampled 3 nodes (out of 5)
INFO :      aggregate_train: Received 3 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'train_loss': 10.710087394714355}
INFO :      configure_evaluate: Sampled 5 nodes (out of 5)
INFO :      aggregate_evaluate: Received 5 results and 0 failures
INFO :      └──> Aggregated MetricRecord: {'eval_loss': 5864870.0, 'eval_acc': 0.0}
INFO :
INFO :      Strategy execution finished in 41.42s
INFO :
INFO :      Final results:
INFO :
INFO :      Global Arrays:
INFO :          ArrayRecord (97.734 MB)
INFO :
INFO :      Aggregated ClientApp-side Train Metrics:
INFO :      { 1: {'train_loss': '1.9972e+01'},
INFO :        2: {'train_loss': '1.4021e+01'},
INFO :        3: {'train_loss': '1.0074e+01'}}
INFO :
INFO :      Aggregated ClientApp-side Evaluate Metrics:
INFO :      { 1: {'eval_acc': '0.0000e+00', 'eval_loss': '2.4158e+07'},
INFO :        2: {'eval_acc': '0.0000e+00', 'eval_loss': '1.4521e+07'},
INFO :        3: {'eval_acc': '0.0000e+00', 'eval_loss': '5.5865e+06'}}
INFO :
INFO :      ServerApp-side Evaluate Metrics:
INFO :      {}
```
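As the strategy banner indicates (`Weighted by: 'num-examples'`), the aggregated metrics above are weighted averages over the replies returned by the sampled ClientApps. Below is a minimal sketch of that computation in plain Python; the reply values are made up, and the real aggregation is handled by the strategy in `strategy.py`.

```python
# Hedged sketch: a 'num-examples'-weighted average of client metrics,
# in the spirit of what the strategy computes. Values below are made up.
replies = [
    {"train_loss": 21.0, "num-examples": 500},
    {"train_loss": 18.5, "num-examples": 1500},
    {"train_loss": 20.0, "num-examples": 1000},
]

# Clients holding more examples contribute proportionally more.
total_examples = sum(r["num-examples"] for r in replies)
train_loss = sum(r["train_loss"] * r["num-examples"] for r in replies) / total_examples
print({"train_loss": train_loss})  # {'train_loss': 19.416666666666668}
```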
## Override Run Config
You can also override the settings for your ClientApp and ServerApp defined in the `[tool.flwr.app.config]` section of the `pyproject.toml`. This can be done by extending the list of arguments passed via the `--run-config` argument to `flwr run`. For example:
```shell
# Run for 5 rounds
flwr run . --run-config="wandb-token='<YOUR-WANDB-TOKEN>' num-server-rounds=5" --stream

# Run for 5 rounds using the ResNet-101 model
flwr run . --run-config="wandb-token='<YOUR-WANDB-TOKEN>' num-server-rounds=5 model-name=resnet101" --stream
```
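Inside the app, these values are read from the run config via the `Context` object. The skeleton below is an illustrative sketch, assuming the Message-API `ServerApp` of recent Flower releases; the app's real entry point lives in `server_app.py`.

```python
# Hedged sketch: reading run-config overrides inside a ServerApp.
# Key names match [tool.flwr.app.config]; the skeleton is illustrative,
# see server_app.py for the app's actual entry point.
from flwr.common import Context
from flwr.serverapp import Grid, ServerApp

app = ServerApp()

@app.main()
def main(grid: Grid, context: Context) -> None:
    num_rounds = context.run_config["num-server-rounds"]  # e.g. 5 when overridden
    model_name = context.run_config["model-name"]         # e.g. "resnet101"
    # ...build the model named by `model_name` and run `num_rounds` rounds
```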