# Feature Election: Federated feature selection with Flower

A federated feature selection framework for tabular datasets.

## Quickstart

```shell
flwr new @christofilojohn/feature-election
```
## Overview

Feature Election enables multiple clients with tabular datasets to collaboratively identify the most relevant features without sharing raw data. This work originates from *FLASH: A Framework for Federated Learning with Attribute Selection and Hyperparameter Optimization*, presented at FLTA IEEE 2025 (Best Student Paper Award).

**Key features:**

- **Privacy-preserving**: Clients share only feature selections and scores, never raw data
- **Multiple FS methods**: Lasso, Random Forest, Mutual Information, RFE, and more
- **Configurable aggregation**: Control the balance between intersection and union of features
## Citation

If you use Feature Election in your research, please cite the FLASH framework paper:

**IEEE style:**

> I. Christofilogiannis, G. Valavanis, A. Shevtsov, I. Lamprou and S. Ioannidis, "FLASH: A Framework for Federated Learning with Attribute Selection and Hyperparameter Optimization," 2025 3rd International Conference on Federated Learning Technologies and Applications (FLTA), Dubrovnik, Croatia, 2025, pp. 93-100, doi: 10.1109/FLTA67013.2025.11336571.

**BibTeX:**

```bibtex
@INPROCEEDINGS{11336571,
  author={Christofilogiannis, Ioannis and Valavanis, Georgios and Shevtsov, Alexander and Lamprou, Ioannis and Ioannidis, Sotiris},
  booktitle={2025 3rd International Conference on Federated Learning Technologies and Applications (FLTA)},
  title={FLASH: A Framework for Federated Learning with Attribute Selection and Hyperparameter Optimization},
  year={2025},
  pages={93-100},
  doi={10.1109/FLTA67013.2025.11336571}
}
```
## Feature Selection Methods
| Method | Description | Speed |
|---|---|---|
| lasso | L1-regularized regression (sparse) | Fast |
| elastic_net | Elastic Net regularization | Fast |
| random_forest | Random Forest importance | Medium |
| mutual_info | Mutual information | Medium |
| f_classif | F-statistic | Fast |
| chi2 | Chi-squared test | Fast |
| rfe | Recursive Feature Elimination | Slow |
| pyimpetus | PyImpetus Markov Blanket | Slow |
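The package's actual selectors live in `feature_election_utils.py`; as an illustration of what any of these univariate methods produces locally, here is a minimal NumPy stand-in that scores each feature (squared Pearson correlation with the label, a rough analog of `f_classif`) and returns the binary mask plus scores that a client would later submit. The function names are illustrative, not the package's API.

```python
import numpy as np

def score_features(X, y):
    # Univariate score per feature: squared Pearson correlation with the label
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = 1.0  # constant features get score 0, avoid division by zero
    r = (Xc.T @ yc) / denom
    return r ** 2

def select_top_k(X, y, k):
    # Local feature selection: boolean mask over the k highest-scoring features
    scores = score_features(X, y)
    mask = np.zeros(X.shape[1], dtype=bool)
    mask[np.argsort(scores)[-k:]] = True
    return mask, scores
```

Each client runs something of this shape on its private data; only `mask` and `scores` ever leave the client.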
## Key Parameters

### Freedom Degree (alpha)

Controls the feature selection strategy, with alpha in [0, 1]:

- `alpha = 0.0`: Intersection (keep only features selected by ALL clients)
- `alpha = 1.0`: Union (keep features selected by ANY client)
- `alpha = 0.5`: Balanced selection (recommended)

### Aggregation Mode

- `weighted`: Weight client contributions by sample count (recommended)
- `uniform`: Equal weight for all clients
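One plausible reading of how `freedom-degree` and the aggregation mode combine (the exact FLASH rule may differ) is a weighted vote with a threshold that slides from unanimity (intersection) down to a single vote (union). A minimal sketch, assuming sample-count weighting:

```python
import numpy as np

def aggregate_masks(masks, weights, alpha):
    # masks: (num_clients, num_features) booleans; weights: per-client sample counts.
    # Illustrative threshold rule, not the package's exact implementation.
    masks = np.asarray(masks, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                    # "weighted" mode; use equal weights for "uniform"
    votes = w @ masks                  # weighted fraction of clients voting for each feature
    threshold = 1.0 - alpha            # alpha=0 -> unanimous vote, alpha=1 -> any vote
    return (votes > 0) & (votes >= threshold - 1e-9)
```

With `alpha = 0.0` only features every client voted for survive; with `alpha = 1.0` any feature with at least one vote survives; intermediate values interpolate between the two.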
## Project Structure

```shell
feature-election/
├── feature_election/              # Package directory
│   ├── __init__.py
│   ├── client_app.py              # ClientApp with @app.train() and @app.evaluate()
│   ├── server_app.py              # ServerApp with @app.main()
│   ├── strategy.py                # Feature Election strategy
│   ├── feature_election_utils.py  # Feature selection methods
│   └── task.py                    # Data loading utilities
├── pyproject.toml                 # Configuration and dependencies
├── README.md
└── test.py                        # Quick verification script
```
## Installation

```shell
# Clone the repository
git clone --depth=1 https://github.com/adap/flower.git \
  && mv flower/examples/feature-election . \
  && rm -rf flower \
  && cd feature-election

# Install dependencies
pip install -e .
```
## Running the Project

### Quick Verification

```shell
python test.py
```

### Simulation (CPU)

Run with default parameters:

```shell
flwr run .
```

### Simulation (GPU)

If you have a GPU available:

```shell
flwr run . local-simulation-gpu
```

### Custom Configuration

Override parameters at runtime:

```shell
flwr run . --run-config "freedom-degree=0.3 fs-method='random_forest'"
```

Or edit `pyproject.toml`:

```toml
[tool.flwr.app.config]
freedom-degree = 0.3
aggregation-mode = "weighted"
fs-method = "random_forest"
num-rounds = 1
```
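The `--run-config` string is a list of space-separated `key=value` pairs with optional shell-style quoting. The following is not Flower's actual parser, just a small stdlib sketch of how such a string maps to a config dictionary:

```python
import shlex

def parse_run_config(overrides):
    # Split on whitespace while honoring quotes, then split each key=value pair.
    # Values stay strings here; the app decides how to coerce them (float, int, ...).
    config = {}
    for token in shlex.split(overrides):
        key, _, value = token.partition("=")
        config[key] = value
    return config

cfg = parse_run_config("freedom-degree=0.3 fs-method='random_forest'")
```

Keys here match those in `[tool.flwr.app.config]`; a runtime override simply shadows the value in `pyproject.toml`.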
## Run with the Deployment Engine

Follow this how-to guide to run the same app in this example but with Flower's Deployment Engine. After that, you might be interested in setting up secure TLS-enabled communications and SuperNode authentication in your federation.

If you are already familiar with how the Deployment Engine works, you may want to learn how to run it using Docker. Check out the Flower with Docker documentation.
## Configuration Reference

### Feature Election Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| freedom-degree | float | 0.6 | Selection strategy (0=intersection, 1=union) |
| aggregation-mode | str | "weighted" | "weighted" or "uniform" |
| fs-method | str | "mutual_info" | Feature selection method |
| eval-metric | str | "f1" | Evaluation metric ("f1", "accuracy", "auc") |
### Federated Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| num-rounds | int | 15 | Total number of FL rounds, including tuning and aggregation |
| fraction-train | float | 1.0 | Fraction of nodes for training |
| fraction-evaluate | float | 1.0 | Fraction of nodes for evaluation |
## Results

After running, results are saved to `outputs/<date>/<time>/`:

### `feature_election_results.json`

```json
{
  "global_feature_mask": [true, false, true, ...],
  "feature_selection_with_names": {
    "feature_000": true,
    "feature_001": false,
    "feature_002": true
  },
  "selected_feature_names": ["feature_000", "feature_002", ...],
  "election_stats": {
    "num_clients": 10,
    "num_features_original": 100,
    "num_features_selected": 35,
    "reduction_ratio": 0.65,
    "freedom_degree": 0.5,
    "intersection_features": 15,
    "union_features": 50
  },
  "tuning_history": [
    {"freedom_degree": 0.5, "score": 0.82, "num_features_selected": 30},
    {"freedom_degree": 0.6, "score": 0.85, "num_features_selected": 35}
  ],
  "total_bytes_transmitted": 123456
}
```

### `client_feature_selections.json`

```json
{
  "client_1": {
    "num_samples": 800,
    "num_features_selected": 25,
    "selected_feature_names": ["feature_000", "feature_003", ...],
    "all_features": {
      "feature_000": {"selected": true, "score": 0.95},
      "feature_001": {"selected": false, "score": 0.12}
    }
  },
  "_summary": {
    "total_clients": 2,
    "features_selected_by_all": ["feature_000", "feature_005"],
    "features_selected_by_any": ["feature_000", "feature_002", "feature_003"],
    "num_intersection": 18,
    "num_union": 35
  }
}
```
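A typical follow-up is to load the elected mask and apply it to a local dataset. A minimal sketch, assuming only the `global_feature_mask` field shown above (the helper names are illustrative):

```python
import json
import numpy as np

def load_global_mask(path):
    # Read the aggregated election results and return the boolean feature mask
    with open(path) as f:
        results = json.load(f)
    return np.asarray(results["global_feature_mask"], dtype=bool)

def apply_mask(X, mask):
    # Keep only the elected feature columns
    return X[:, mask]
```

For example, a mask of `[true, false, true]` applied to a matrix with 3 columns keeps columns 0 and 2.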
## Algorithm

1. **Client selection**: Each client performs local feature selection
2. **Score calculation**: Clients compute feature importance scores
3. **Submission**: Clients send binary masks and scores to the server (never raw data)
4. **Aggregation**: The server aggregates selections via weighted voting controlled by `freedom_degree`
5. **Distribution**: The server broadcasts the global feature mask to all clients
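The steps above can be simulated end to end in a few lines. This is a toy sketch, not the package's strategy code: the scorer, the client data, and the intersection-style aggregation (`alpha = 0.0`) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_select(X, y, k):
    # Steps 1-2: score features locally (squared correlation with label), keep top-k
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    scores = ((Xc.T @ yc) / denom) ** 2
    mask = np.zeros(X.shape[1], dtype=bool)
    mask[np.argsort(scores)[-k:]] = True
    return mask

# Two clients with private data; feature 0 is informative for both
clients = []
for _ in range(2):
    y = rng.integers(0, 2, size=200).astype(float)
    X = rng.normal(size=(200, 5))
    X[:, 0] = y + 0.1 * rng.normal(size=200)
    clients.append((X, y))

# Step 3: each client submits only its mask and sample count, never raw data
masks, weights = [], []
for X, y in clients:
    masks.append(local_select(X, y, k=2))
    weights.append(len(y))

# Step 4: server aggregates by weighted voting (alpha = 0.0 here, i.e. intersection)
w = np.asarray(weights, dtype=float)
w /= w.sum()
votes = w @ np.asarray(masks, dtype=float)
global_mask = votes >= 1.0 - 1e-9

# Step 5: broadcast global_mask; each client then trains on X[:, global_mask]
```

Only binary masks and sample counts cross the network in this sketch, mirroring the privacy claim in the Overview.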
## License
Licensed under the Apache License, Version 2.0.