
@bahaaelden/subgroup-seq-agg

flwr new @bahaaelden/subgroup-seq-agg

Sequential Subgroup Aggregation in Federated Learning

A Flower Federated Learning application where clients train in sequential phases, allowing overlapping clients to carry learned knowledge between groups.

Author: Bahaa-Elden Ali Abdelghany


💡 The Core Concept

In standard Federated Learning (FL), clients typically train together in a single global aggregation process where updates from many clients are combined into one global model.

This project introduces Sequential Grouping.

Instead of training all clients simultaneously, we divide them into phased groups that train one after another.

  • Phase 1 → Group 1 trains
  • Phase 2 → Group 2 trains
  • Phase 3 → Group 3 trains

Each group may use a different federated aggregation strategy.

Example:

  • Group 1 → FedAvg
  • Group 2 → FedProx
  • Group 3 → QFedAvg

The model produced by one group becomes the starting baseline for the next group, forming a training pipeline across client groups.
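The pipeline above can be sketched in a few lines of plain Python. This is an illustrative toy, not the project's actual API: the `fedavg` helper, the per-node "updates", and the list-of-floats model are stand-ins that only show how each group's output model seeds the next group.

```python
# Toy sketch of sequential subgroup aggregation: each group runs its own
# aggregation phase, and the resulting model becomes the next group's start.

def fedavg(updates):
    """Plain parameter averaging (FedAvg-style)."""
    n = len(updates)
    return [sum(vals) / n for vals in zip(*updates)]

def train_group(model, node_ids, aggregate):
    """Simulate one phase: each node nudges the model, then aggregate."""
    updates = [[w + 0.1 * node for w in model] for node in node_ids]
    return aggregate(updates)

# Phased pipeline: Group 2 starts from Group 1's output model, and so on.
model = [0.0, 0.0]
for group_nodes in ([1, 2, 3, 4], [5, 6], [4, 7]):
    model = train_group(model, group_nodes, fedavg)

print(model)
```

Swapping `fedavg` for a different aggregation function per group mirrors the per-group strategy choice described above.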


🌉 Bridge Nodes (Knowledge Carry-Over)

The most important feature of this project is how it handles clients that belong to multiple groups.

These clients act as bridges, transferring knowledge between training phases.

Example:

Suppose Node 4 belongs to Group 1 and Group 3.

Phase 1
Group 1 → Nodes: 1,2,3,4
Result → Model M1
Node 4 participates in generating M1.

Phase 2
Group 2 → Nodes: 5,6
Result → Model M2
Node 4 does not participate in this phase.

Phase 3
Group 3 → Nodes: 4,7
Since Node 4 trained in Phase 1, it carries its learned parameters forward.
Node 4 therefore initializes the new group with its previously trained model, effectively bridging knowledge between phases.
Node 7, which is new to the pipeline, begins training from Node 4's parameters.

If multiple bridge nodes exist, the system uses the model from the most recently trained bridge node as the initialization seed for the new group.

In this way, overlapping nodes propagate knowledge across sequential training phases.
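The seed-selection rule above can be sketched as a small helper. The names (`pick_seed`, `trained_in`, `group_models`) are illustrative, not the project's actual code; the logic follows the rule as stated: the most recently trained bridge node's model wins, with the latest global model as the no-overlap fallback.

```python
# Sketch of bridge-node initialization for a new group.

def pick_seed(new_group, trained_in, group_models, latest_global):
    """Return the initialization model for `new_group`.

    trained_in   -- maps node id -> index of the last phase it trained in
    group_models -- model snapshot produced by each finished phase
    """
    bridges = [n for n in new_group if n in trained_in]
    if not bridges:
        return latest_global  # no overlap: start from the latest global model
    # The most recently trained bridge node provides the seed.
    freshest = max(bridges, key=lambda n: trained_in[n])
    return group_models[trained_in[freshest]]

# State after Phases 1 and 2 from the example above.
trained_in = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1}
group_models = {0: "M1", 1: "M2"}

seed = pick_seed([4, 7], trained_in, group_models, latest_global="M2")
print(seed)  # prints: M1  (Node 4 bridges Phase 1 into Phase 3)
```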


🎯 Use Cases

Sequential subgroup aggregation is useful when clients or data sources are heterogeneous and cannot all participate in the same training phase.

Multi-Organization Federated Learning

  • Phase 1 → Hospital Network A
  • Phase 2 → Hospital Network B
  • Phase 3 → Hospital Network C

Bridge nodes transfer knowledge between organizations without sharing raw data.

Multimodal Federated Learning

  • Group 1 → Vision clients
  • Group 2 → Audio clients
  • Group 3 → Multimodal clients

Clients that support multiple modalities naturally act as bridges between training stages.

Edge–Cloud Hierarchical Training

  • Phase 1 → Edge devices
  • Phase 2 → Regional aggregators
  • Phase 3 → Global aggregation

Bridge nodes propagate learned representations between levels.

Resource-Constrained Federated Systems

In very large deployments, not all devices can participate simultaneously.

Sequential grouping allows:

  • controlled scheduling of clients
  • reduced communication load
  • scalable training phases

Strategy Experimentation

Researchers can evaluate different aggregation strategies sequentially.

  • Group 1 → FedAvg
  • Group 2 → FedProx
  • Group 3 → QFedAvg

This makes it possible to observe how earlier strategies influence downstream training.


🚀 Quick Start

  1. Install Dependencies

    pip install "flwr[simulation]" "flwr-datasets[vision]" torch torchvision numpy

    Or install locally:

    pip install .
  2. Run the Simulation

    flwr run .

βš™οΈ Configuration (No Python Required)

The entire training pipeline is controlled through pyproject.toml.

No Python code needs to be modified.

Under the [tool.flwr.app.config] section you define your sequential training groups.

Example: Two-Stage Training Pipeline

[tool.flwr.app.config]
dataset = "mnist"
num-groups = 2
rounds-per-group = 2

# --- Group 1 (Trains First) ---
group-1-name = "Group 1"
group-1-nodes = "1,2,3,4"
group-1-strategy = "QFedAvg"
group-1-carry-over = true

# --- Group 2 (Trains Second) ---
group-2-name = "Group 2"
group-2-nodes = "4,5,6"
group-2-strategy = "FedProx"
group-2-strategy-params = '{"proximal_mu": 0.1}'
group-2-carry-over = true

⚠️ For the bridge mechanism to work, carry-over must be enabled.
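To make the flat key scheme concrete, here is a sketch of how the `group-N-*` keys from the example could be parsed into per-group settings. The `parse_groups` helper is illustrative only, not the project's actual loader; it just shows the key naming convention and the JSON-encoded `strategy-params` value.

```python
# Sketch: turn the flat [tool.flwr.app.config] keys into per-group dicts.
import json

def parse_groups(config):
    groups = []
    for i in range(1, int(config["num-groups"]) + 1):
        prefix = f"group-{i}-"
        groups.append({
            "name": config[prefix + "name"],
            "nodes": [int(n) for n in config[prefix + "nodes"].split(",")],
            "strategy": config[prefix + "strategy"],
            # strategy-params is a JSON string inside the TOML value.
            "strategy_params": json.loads(config.get(prefix + "strategy-params", "{}")),
            "carry_over": config.get(prefix + "carry-over", False),
        })
    return groups

# The two-stage example from above, as the app would receive it.
config = {
    "num-groups": 2,
    "group-1-name": "Group 1", "group-1-nodes": "1,2,3,4",
    "group-1-strategy": "QFedAvg", "group-1-carry-over": True,
    "group-2-name": "Group 2", "group-2-nodes": "4,5,6",
    "group-2-strategy": "FedProx",
    "group-2-strategy-params": '{"proximal_mu": 0.1}',
    "group-2-carry-over": True,
}
groups = parse_groups(config)
print(groups[1]["strategy_params"])  # prints: {'proximal_mu': 0.1}
```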


🔧 Supported Aggregation Strategies

The following strategies are tested and supported:

  • FedAvg
  • FedMedian
  • FedAdam
  • FedAdagrad
  • FedYogi

Some strategies require patched implementations to ensure compatibility with NumPy-based parameter operations.


πŸ” Conditionally Supported Strategies

The following strategies may be available depending on the Flower version and installed dependencies:

  • FedProx
  • FedAvgM
  • FedTrimmedAvg
  • Krum
  • MultiKrum
  • Bulyan
  • QFedAvg
  • FedXgbBagging
  • FedXgbCyclic

These strategies are loaded using a try/except compatibility mechanism. If a strategy is unavailable in the current Flower environment, it will not be registered.
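A try/except loader of this kind can be sketched as follows. The registry dict and `register` helper are illustrative names, not the project's actual code; the point is that each optional strategy is imported defensively, so only classes present in the installed Flower build get registered.

```python
# Sketch: register a strategy only if the installed Flower version provides it.
import importlib

AVAILABLE_STRATEGIES = {}

def register(name, module="flwr.server.strategy"):
    """Add `name` to the registry if `module` exposes it; skip it otherwise."""
    try:
        AVAILABLE_STRATEGIES[name] = getattr(importlib.import_module(module), name)
    except (ImportError, AttributeError):
        pass  # strategy unavailable in this environment: simply not registered

for name in ["FedAvg", "FedProx", "QFedAvg", "FedXgbBagging"]:
    register(name)

print(sorted(AVAILABLE_STRATEGIES))
```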


🗂️ Project Structure

subgroup_sequential_aggregation/
│
├── server_app.py
├── client_app.py
├── hierarchical_strategy.py
├── group.py
├── patched_strategies.py
├── task.py
└── pyproject.toml

File Roles

  • pyproject.toml β€” Defines sequential groups and aggregation strategies.
  • server_app.py β€” Runs the Flower server and registers the sequential aggregation pipeline.
  • client_app.py β€” Defines the training logic executed by each simulated client node.
  • task.py β€” Contains the PyTorch model and dataset loading logic.
  • hierarchical_strategy.py β€” Core engine managing sequential phases and bridge node carry-over.
  • group.py β€” Defines group layers and assigns nodes to strategies.
  • patched_strategies.py β€” Provides compatibility fixes for certain Flower strategies.

📊 Outputs

After training completes, results are saved in the results/ directory.

Example:

results/
├── metrics_history.json
├── final_model.npz
├── model_group_1.npz
└── model_group_2.npz

Output Files

  • metrics_history.json β€” Complete training metrics across all rounds and groups.
  • final_model.npz β€” The final global model produced after the last group.
  • model_group_X.npz β€” Snapshots of the model at the end of each group phase.
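The `.npz` snapshots can be inspected with plain NumPy. The array keys used below (`layer_0`, `layer_1`) are an assumption for illustration; the project may store its per-layer arrays under different names, so list `model.files` first to see what a given archive actually contains.

```python
# Sketch: write a stand-in .npz archive the way a model snapshot might be
# saved, then load it back and list the stored parameter arrays.
import numpy as np

np.savez("final_model.npz", layer_0=np.zeros((4, 2)), layer_1=np.ones(2))

with np.load("final_model.npz") as model:
    for key in model.files:
        print(key, model[key].shape)
```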

⚠️ Limitations

While Sequential Subgroup Aggregation enables flexible phased training, several limitations exist.

Sequential Execution
Groups train strictly one after another, which may increase total training time compared to fully parallel federated learning.

Bridge Node Dependency
Knowledge transfer between groups requires overlapping clients. If no bridge nodes exist between two groups, the next group starts from the latest global model.

Strategy Compatibility
Not all Flower strategies are fully supported. Some require patched implementations or depend on the Flower version.

Simulation-Focused Design
The current project is primarily designed for Flower simulation environments and may require additional infrastructure for large-scale production deployments.


📚 Citation

If you use this project in your research, please cite:

@INPROCEEDINGS{10206255,
  author={Abdelghany, Bahaa-Elden A. and Fernández-Veiga, M. and Fernández-Vilas, A. and Hassan, Ammar M. and Abdelmoez, Walid M. and El-Bendary, Nashwa},
  booktitle={2022 32nd International Conference on Computer Theory and Applications (ICCTA)}, 
  title={Scheduling and Communication Schemes for Decentralized Federated Learning}, 
  year={2022},
  doi={10.1109/ICCTA58027.2022.10206255}
}