@gfedops/fedops-mnist-xai
flwr new @gfedops/fedops-mnist-xai

FedOps MNIST Client App (FlowerHub)
This repository provides a FedOps Client App for MNIST training in a federated setting. It is intended for real client participation (laptop/PC/server/edge nodes), not only single-machine simulation.
In other words, this project is the client-side application that joins a FedOps-managed FL lifecycle. It also supports optional XAI-based interpretation, such as Grad-CAM, to help visualize and explain client-side model predictions.
What Is FedOps?
FedOps (Federated Learning Lifecycle Operations Management Platform) is a platform for operating federated learning end-to-end in production-like environments:
- migrate existing AI/ML workflows into FL with minimal friction
- scale participation across many clients in an MLOps-like manner
- continuously manage and monitor the full FL lifecycle
FedOps official site: https://ccl.gachon.ac.kr/fedops
What This App Is (and Is Not)
This repository (fedops-mnist-xai) is a FedOps Client App.
- It runs local training on client-side data.
- It sends updates to an FL server.
- It receives aggregated global updates for the next round.
- In addition to standard federated learning, this version also supports XAI (Explainable AI) using Grad-CAM. When enabled, the client can generate visual explanations for model predictions on sample test images before or during federated execution, depending on the configuration.
It does not create the FedOps FL server itself.
To set up the FedOps FL server first, follow: https://gachon-cclab.github.io/docs/FedOps-Tutorials/Create-FL-Server/
Operational Capabilities in the FedOps Context
FedOps supports real-device FL operations with platform-level capabilities such as:
- FLScalize: adapt existing models/data into FL client/server workflows
- Manager: monitor and manage client/server execution states
- CE/CS (Contribution Evaluation / Client Selection): contribution-aware client selection logic
- CI/CD/CFL: Git-based deployment and continuous/cyclic federated workflows
- Monitoring Dashboard: lifecycle visibility across run states, logs, and rounds
How This Client Runs
When you run this app with Flower (flwr run .), it launches two FedOps-side client processes:
- local training client (fedopsmnist.client_main)
- communication manager (fedopsmnist.client_manager_main)
Together, they participate in the practical FL cycle:
- Train on local client data
- Send model updates to the FL server
- Receive aggregated global model updates
- Repeat for the next round
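The round cycle above can be sketched as a minimal, framework-free loop. This is illustrative only: `local_train` and `server_aggregate` are hypothetical stand-ins, not FedOps or Flower APIs.

```python
# Illustrative sketch of the federated round cycle (NOT the FedOps client code).
# local_train / server_aggregate are made-up names for the four steps above.

def local_train(weights, data, lr=0.1):
    """Toy 'training': nudge each weight toward the local data mean."""
    target = sum(data) / len(data)
    return [w + lr * (target - w) for w in weights]

def server_aggregate(updates):
    """FedAvg-style aggregation: element-wise mean of client updates."""
    n = len(updates)
    return [sum(ws) / n for ws in zip(*updates)]

global_model = [0.0, 0.0]
client_data = {"client_a": [1.0, 2.0, 3.0], "client_b": [3.0, 4.0, 5.0]}

for round_idx in range(2):
    # 1) each client trains on its local data; 2) updates go to the server
    updates = [local_train(global_model, d) for d in client_data.values()]
    # 3) clients receive the aggregated global model; 4) repeat next round
    global_model = server_aggregate(updates)
```

In the real app these steps are driven by the Flower runtime and the FedOps client processes; the sketch only shows the data flow of one round.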
Installation
pip install -e .
Run
Default run:
flwr run .
Run with a specific FedOps task ID:
flwr run . --run-config 'task_id="<YOUR_TASK_ID>"'
task_id from Flower run config is injected into the FedOps client configuration at startup.
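The precedence implied above (a run-config `task_id` overrides the file default) can be sketched as follows; `resolve_task_id` and the dict shapes are hypothetical, not the actual FedOps injection code.

```python
# Hedged sketch of run-config -> client-config injection (not the FedOps source).
# 'run_config' stands in for the values Flower passes at startup via
#   flwr run . --run-config 'task_id="<YOUR_TASK_ID>"'

def resolve_task_id(run_config: dict, file_config: dict) -> str:
    # A task_id from the Flower run config wins over the conf/config.toml default.
    return run_config.get("task_id") or file_config.get("task_id", "")

file_config = {"task_id": "default-task"}
assert resolve_task_id({}, file_config) == "default-task"
assert resolve_task_id({"task_id": "mnist-xai-01"}, file_config) == "mnist-xai-01"
```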
Project Structure
fedops-mnist-xai
├── fedopsmnist
│   ├── __init__.py
│   ├── client_app.py            # Placeholder ClientApp for Flower app validation
│   ├── launcher_app.py          # Flower ServerApp entrypoint (process launcher)
│   ├── client_main.py           # FedOps local training client
│   ├── client_manager_main.py   # FedOps communication manager (FastAPI)
│   ├── data_preparation.py      # MNIST data loading
│   ├── models.py                # MNIST model/train/eval helpers
│   ├── xai_utils.py             # XAI (Grad-CAM) utilities for model explanation
│   └── conf/config.toml         # FedOps client configuration
├── pyproject.toml
└── README.md
Features
- Federated learning client for MNIST
- Local model initialization and checkpoint restore
- FedOps task-based client execution
- Config-driven runtime using config.toml
- Optional Grad-CAM-based XAI visualization
- Configurable XAI output directory and target layer
Example XAI Configuration
Below is an example configuration for running a simple federated MNIST experiment with PyTorch, while enabling Grad-CAM-based XAI on the client side.
random_seed = 42
learning_rate = 0.001
model_type = "Pytorch"
task_id = "task_id"
num_epochs = 1
batch_size = 128
num_rounds = 2
clients_per_round = 1

[model]
_target_ = "fedopsmnist.models.MNISTClassifier"
output_size = 10

[dataset]
name = "MNIST"
validation_split = 0.2

[xai]
enabled = true
run_location = "client"
target_layer_index = -1
output_dir = "outputs/gradcam"
layer = "conv2"

[wandb]
use = false
key = "your wandb api key"
account = "your wandb account"
project = "MNIST_task_id"

[server.strategy]
_target_ = "flwr.server.strategy.FedAvg"
fraction_fit = 0.00001
fraction_evaluate = 0.000001
min_fit_clients = 1
min_available_clients = 1
min_evaluate_clients = 1
Parameters
- enabled: Enables or disables XAI execution.
- run_location: Specifies where XAI should run. The current recommended value is "client".
- target_layer_index: Index of the target layer for Grad-CAM. -1 usually means the last eligible convolutional layer.
- output_dir: Directory where Grad-CAM results are saved.
- layer: Name of the target layer in the model. For the default MNIST CNN, this is typically "conv2".
Important note
The configured layer must match the actual model layer name in models.py.
If the target layer does not exist, Grad-CAM generation will fail.
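The layer-name check described above can be sketched as a simple lookup with an explicit failure. In PyTorch this lookup would be against `dict(model.named_modules())`; a plain dict stands in here, and `resolve_target_layer` is a hypothetical helper, not part of xai_utils.py.

```python
# Sketch of validating the configured Grad-CAM layer name before hooking it.
# model_layers mimics dict(model.named_modules()) for the default MNIST CNN.

model_layers = {"conv1": "Conv2d(1, 32)", "conv2": "Conv2d(32, 64)", "fc": "Linear(...)"}

def resolve_target_layer(layers: dict, name: str):
    """Return the named layer, or fail loudly so misconfiguration is obvious."""
    if name not in layers:
        raise ValueError(
            f"XAI target layer {name!r} not found; available: {sorted(layers)}"
        )
    return layers[name]

assert resolve_target_layer(model_layers, "conv2") == "Conv2d(32, 64)"
try:
    resolve_target_layer(model_layers, "conv9")  # misconfigured name -> error
except ValueError as err:
    assert "conv9" in str(err)
```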
Notes
- Network connectivity to the target FedOps FL server is required.
- In local (non-Docker) execution, the launcher uses environment-based task ID injection so client-manager communication remains on localhost.
- Grad-CAM is the current XAI method supported in this project, and we plan to extend it with additional explainability methods in future development.