Custom Metrics for Federated Learning with TensorFlow and Flower
This simple example demonstrates how to calculate custom metrics over multiple clients, beyond the traditional ones available in ML frameworks. Specifically, it uses readily available scikit-learn metrics: accuracy, recall, precision, and F1-score.
Once both the test values (y_test) and the predictions (y_pred) are available on the client side (client_app.py), other metrics, including custom ones, can be calculated.
The main takeaways of this implementation are:
- the return of multiple evaluation metrics generated by the evaluate method in client_app.py
- the use of evaluate_metrics_aggregation_fn, part of the strategy in server_app.py, to aggregate the metrics on the server side
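As an illustration of the server-side aggregation step, here is a minimal sketch of a function that could be passed as evaluate_metrics_aggregation_fn. It assumes the function receives a list of (num_examples, metrics) pairs, one per client, and returns a single metrics dictionary. The name weighted_average, the metric keys, and the toy client results are hypothetical, and the example project itself may aggregate differently (for instance, with an unweighted mean).

```python
from typing import Dict, List, Tuple

Metrics = Dict[str, float]


def weighted_average(results: List[Tuple[int, Metrics]]) -> Metrics:
    """Average each metric across clients, weighted by each client's example count."""
    total_examples = sum(num_examples for num_examples, _ in results)
    if total_examples == 0:
        # Nothing to aggregate (no clients, or clients reported zero examples).
        return {}
    # Assumption: every client reports the same metric names as the first one.
    keys = results[0][1].keys()
    return {
        key: sum(n * m[key] for n, m in results) / total_examples
        for key in keys
    }


# Hypothetical per-client evaluation results: (num_examples, metrics).
client_results = [
    (100, {"accuracy": 0.80, "f1": 0.70}),
    (300, {"accuracy": 0.60, "f1": 0.50}),
]
aggregated = weighted_average(client_results)
```

Weighting by example count keeps a client with very little test data from dominating the aggregate; a plain mean over clients is a reasonable alternative when all clients hold similar amounts of data.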
This example is based on the quickstart-tensorflow example with CIFAR-10, with the addition of Flower Datasets to retrieve the CIFAR-10 dataset.
Because CIFAR-10 classification is a multi-class problem, the metric calculations require some changes: passing average='micro' to the scikit-learn metrics and using np.argmax to convert the model outputs into class labels. For binary classification, this is not required. For unsupervised learning tasks, such as training a deep autoencoder, a custom metric based on reconstruction error could be implemented on the client side.
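The multi-class handling described above can be sketched as follows. This is a minimal illustration, not the example's actual client code: the helper name compute_metrics, the dictionary keys, and the toy arrays are assumptions. Only np.argmax and the scikit-learn metric functions with average='micro' come from the text.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


def compute_metrics(y_true, y_prob):
    """Compute micro-averaged metrics from true labels and per-class scores."""
    # The model emits one score per class; argmax selects the predicted label.
    y_pred = np.argmax(y_prob, axis=1)
    return {
        "accuracy": float(accuracy_score(y_true, y_pred)),
        "recall": float(recall_score(y_true, y_pred, average="micro")),
        "precision": float(precision_score(y_true, y_pred, average="micro")),
        "f1": float(f1_score(y_true, y_pred, average="micro")),
    }


# Toy data, made up for illustration: 4 samples, 3 classes.
y_true = np.array([0, 1, 2, 2])
y_prob = np.array([
    [0.8, 0.1, 0.1],  # predicted class 0 (correct)
    [0.2, 0.7, 0.1],  # predicted class 1 (correct)
    [0.6, 0.3, 0.1],  # predicted class 0 (wrong, true class is 2)
    [0.1, 0.2, 0.7],  # predicted class 2 (correct)
])
metrics = compute_metrics(y_true, y_prob)
```

Note that in a single-label multi-class setting, micro-averaged precision, recall, and F1 all equal the accuracy, so average='macro' or average='weighted' may be more informative when per-class behavior matters.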
Set up the project
Clone the project
Start by cloning the example project:
git clone --depth=1 https://github.com/adap/flower.git _tmp \
&& mv _tmp/examples/custom-metrics . \
&& rm -rf _tmp && cd custom-metrics
This will create a new directory called custom-metrics containing the following files:
custom-metrics
├── README.md
├── custommetrics_example
│   ├── __init__.py
│   ├── client_app.py   # Defines your ClientApp
│   ├── server_app.py   # Defines your ServerApp
│   └── task.py         # Defines your model and dataloading functions
└── pyproject.toml      # Project metadata like dependencies and configs
Install dependencies and project
Install the dependencies defined in pyproject.toml as well as the custommetrics_example package.
pip install -e .
Run the Example
You can run your Flower project in both simulation and deployment mode without making changes to the code. If you are new to Flower, we recommend using the simulation mode, as it requires fewer components to be launched manually. By default, flwr run will make use of the Simulation Engine.
Run with the Simulation Engine
flwr run .
You can also override some of the settings for your ClientApp and ServerApp defined in pyproject.toml. For example:
flwr run . --run-config num-server-rounds=5
Run with the Deployment Engine
Note: An update to this example will show how to run this Flower application with the Deployment Engine and TLS certificates, or with Docker.