# Custom Metrics for Federated Learning with TensorFlow and Flower
This simple example demonstrates how to calculate custom metrics over multiple clients, beyond the ones traditionally available in ML frameworks. In this case, it demonstrates the use of readily available scikit-learn metrics: accuracy, recall, precision, and F1-score.
Once both the test values (`y_test`) and the predictions (`y_pred`) are available on the client side (`client_app.py`), other standard or custom metrics can be calculated.
The main takeaways of this implementation are:

- the return of multiple evaluation metrics generated in the `evaluate` method of `client_app.py` (a client-side sketch follows this list)
- the use of `evaluate_metrics_aggregation_fn` to aggregate the metrics on the server side, as part of the strategy in `server_app.py` (a server-side sketch appears further below)
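For instance, the client-side `evaluate` method can compute these metrics with scikit-learn along the following lines. This is a minimal sketch, not the exact code in `client_app.py`; it assumes the model is a Keras classifier compiled with one metric and that `y_test` holds integer class labels:

```python
import numpy as np
from flwr.client import NumPyClient
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


class FlowerClient(NumPyClient):
    # __init__, fit, etc. elided; assumes self.model (a compiled Keras model),
    # self.x_test, and self.y_test (integer class labels) are set up elsewhere
    ...

    def evaluate(self, parameters, config):
        self.model.set_weights(parameters)
        loss, _ = self.model.evaluate(self.x_test, self.y_test, verbose=0)

        # Reduce per-class probabilities to predicted class labels
        y_pred = np.argmax(self.model.predict(self.x_test), axis=1)

        # Any standard or custom metrics can go into this dictionary;
        # it is sent back to the server for aggregation
        metrics = {
            "accuracy": accuracy_score(self.y_test, y_pred),
            "recall": recall_score(self.y_test, y_pred, average="micro"),
            "precision": precision_score(self.y_test, y_pred, average="micro"),
            "f1": f1_score(self.y_test, y_pred, average="micro"),
        }
        return loss, len(self.x_test), metrics
```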
This example is based on the `quickstart-tensorflow` example with CIFAR-10 (source here), with the addition of Flower Datasets to retrieve the CIFAR-10 dataset.
Because CIFAR-10 classification is a multi-class problem, the metrics have to be calculated with `average='micro'` and the predicted probabilities reduced to class labels with `np.argmax`. For binary classification, this is not required. Likewise, for unsupervised learning tasks, such as a deep autoencoder, a custom metric based on the reconstruction error could be implemented on the client side.
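On the server side, the strategy's `evaluate_metrics_aggregation_fn` receives the metrics returned by each client, together with the number of examples each client evaluated on, and combines them. A minimal weighted-average sketch (the function name `weighted_average` is an assumption, not necessarily what `server_app.py` uses):

```python
from flwr.server.strategy import FedAvg


def weighted_average(metrics):
    """Aggregate client metrics, weighted by each client's number of test examples."""
    # `metrics` is a list of (num_examples, metrics_dict) tuples, one per client
    total_examples = sum(num_examples for num_examples, _ in metrics)

    def weighted(key):
        return sum(num_examples * m[key] for num_examples, m in metrics) / total_examples

    return {key: weighted(key) for key in ("accuracy", "recall", "precision", "f1")}


# The aggregation function is plugged into the strategy used by the ServerApp
strategy = FedAvg(evaluate_metrics_aggregation_fn=weighted_average)
```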
## Set up the project
### Clone the project
Start by cloning the example project:
```shell
git clone --depth=1 https://github.com/adap/flower.git _tmp \
    && mv _tmp/examples/custom-metrics . \
    && rm -rf _tmp && cd custom-metrics
```
This will create a new directory called `custom-metrics` containing the following files:
```shell
custom-metrics
├── README.md
├── custommetrics_example
│   ├── __init__.py
│   ├── client_app.py   # Defines your ClientApp
│   ├── server_app.py   # Defines your ServerApp
│   └── task.py         # Defines your model and dataloading functions
└── pyproject.toml      # Project metadata like dependencies and configs
```
### Install dependencies and project
Install the dependencies defined in `pyproject.toml` as well as the `custommetrics_example` package.

```shell
pip install -e .
```
## Run the Example
You can run your Flower project in both simulation and deployment mode without making changes to the code. If you are starting with Flower, we recommend using the simulation mode, as it requires fewer components to be launched manually. By default, `flwr run` will make use of the Simulation Engine.
### Run with the Simulation Engine
```shell
flwr run .
```
You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
```shell
flwr run . --run-config num-server-rounds=5
```
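Values passed with `--run-config` (and the defaults in `pyproject.toml`) are exposed to the apps through the run context. A rough sketch of how a `ServerApp` might read `num-server-rounds`; the surrounding code is illustrative, not a copy of `server_app.py`:

```python
from flwr.common import Context
from flwr.server import ServerApp, ServerAppComponents, ServerConfig
from flwr.server.strategy import FedAvg


def server_fn(context: Context) -> ServerAppComponents:
    # Read the number of rounds from the run config
    # (overridable via `flwr run . --run-config num-server-rounds=5`)
    num_rounds = context.run_config["num-server-rounds"]
    return ServerAppComponents(
        strategy=FedAvg(),
        config=ServerConfig(num_rounds=num_rounds),
    )


app = ServerApp(server_fn=server_fn)
```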
### Run with the Deployment Engine
Follow this how-to guide to run the same app in this example but with Flower's Deployment Engine. After that, you might be interested in setting up secure TLS-enabled communications and SuperNode authentication in your federation.
If you are already familiar with how the Deployment Engine works, you may want to learn how to run it using Docker. Check out the Flower with Docker documentation.