📆 Thursday, August 29, 10:00 AM - 1:00 PM

📍 Centre de Convencions Internacional de Barcelona

KDD 2024:

Privacy-Preserving Federated Learning using Flower Framework

Flower at KDD 2024

AI projects often face the challenge of limited access to meaningful amounts of training data. In traditional approaches, collecting data in a central location can be problematic, especially in industry settings with sensitive and distributed data. However, there is a solution: moving the computation to the data through Federated Learning.

Federated Learning, a distributed machine learning approach, offers a promising solution by enabling model training across devices. It is a data minimization approach in which direct access to data is not required. Furthermore, federated learning can be combined with techniques like differential privacy, secure aggregation, and homomorphic encryption to further enhance privacy protection. In this hands-on tutorial, we delve into privacy-preserving machine learning using federated learning, leveraging the Flower framework, which is specifically designed to simplify the process of building federated learning systems, as our primary tool.

We present the foundations of federated learning, explore how different techniques can enhance its privacy properties, discuss how it is used in real-world settings today, and walk through a series of practical, hands-on code examples that show how you can federate any AI project with Flower, an open-source federated learning framework.

Target Audience and Prerequisites:

This tutorial is suitable for researchers, machine learning practitioners, data scientists, and developers interested in privacy-preserving machine learning techniques. Basic knowledge of machine learning concepts and Python programming is recommended. No prior experience with federated learning or the Flower framework is required.

Meet the tutors

  • Mohammad Naseri

    Research Scientist


Mohammad focuses on the privacy and security aspects of the Flower framework. He recently completed his Ph.D. at University College London (UCL). His research primarily revolves around security and privacy in machine learning, with a particular focus on federated learning. During his Ph.D., Mohammad completed research internships at Microsoft Research and Telefonica. His work has been published in venues such as IEEE S&P, CCS, NDSS, ICML, and PETS.

  • Javier Fernandez

    Lead Research Scientist


Javier works on the core framework and develops the Flower Simulation Engine, which makes it possible to run federated learning workloads in a resource-aware manner and scale them to thousands of active clients. Javier's interests lie at the intersection of machine learning and systems, more concretely running on-device ML workloads, a key component of federated learning. Javier received his PhD in Computer Science from the University of Oxford in 2021. Before joining Flower Labs, he worked as a research scientist at Samsung AI (Cambridge, UK).

  • Heng Pan

    Research Scientist


Pan specializes in federated learning and the integration of secure aggregation functionality into the Flower framework. He holds a master's degree from the University of Cambridge, where he collaborated with Prof. Nic Lane. He was part of the University of Cambridge team that won first prize in the UK-US PETs Prize Challenge for creating a privacy-centric solution to detect anomalies in the SWIFT network. His expertise lies at the intersection of machine learning, federated learning, and data security.

  • Yan Gao

    Research Scientist


Yan works at the forefront of federated learning innovation with different types of models, including XGBoost and LLMs. Prior to this role, he completed his PhD at the University of Cambridge within the Machine Learning Systems Lab. His research interests include machine learning, federated learning, self-supervised learning, and optimisation techniques. Throughout his doctoral studies, he focused on pioneering research in federated self-supervised learning, specifically targeting the challenge of working with unlabelled data across diverse domains such as audio, image, and video. This work has been recognised and published in several top-tier international conferences and journals, including ICCV, ECCV, ICLR, INTERSPEECH, ICASSP, and JMLR, marking significant contributions to the field of federated learning and its applications.

Tutorial outline

Introduction to Federated Learning (30 mins)
Challenges of centralized learning
  • Introduce the limitations of traditional centralized machine learning approaches, such as data privacy concerns, data silos, and scalability issues.
Core concepts of federated learning
  • Define federated learning and its key components, including client devices, server aggregator, and global model.
  • Explain the federated learning workflow, including model initialization, client updates, aggregation, and model updating.
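The workflow above can be sketched in a few lines of framework-free Python. This is a toy illustration only (the model is just a list of weights and `local_update` is a stand-in for real client-side training), not the Flower API:

```python
def local_update(global_weights, client_data, lr=0.1):
    """Simulate one client's local training: nudge each weight toward
    the mean of that client's private data (a stand-in for SGD)."""
    target = sum(client_data) / len(client_data)
    return [w + lr * (target - w) for w in global_weights]

def federated_round(global_weights, clients):
    """One round: broadcast the model, collect client updates, average them."""
    updates = [local_update(global_weights, data) for data in clients]
    return [sum(ws) / len(ws) for ws in zip(*updates)]

# Three clients, each with private data that never leaves the client.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [0.0, 0.0]
for _ in range(5):
    weights = federated_round(weights, clients)
```

Note that only model updates travel to the server; the raw `clients` data stays local, which is the data-minimization property discussed above.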
Model aggregation strategies
  • Explain and compare common aggregation strategies
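The most common strategy, FedAvg, averages client model weights in proportion to how many training examples each client contributed. A minimal stdlib-only sketch (not the Flower implementation):

```python
def fedavg(client_weights, num_examples):
    """FedAvg: average client model weights, weighted by each
    client's number of local training examples."""
    total = sum(num_examples)
    return [
        sum(w[i] * n for w, n in zip(client_weights, num_examples)) / total
        for i in range(len(client_weights[0]))
    ]

# Two clients: one trained on 100 examples, one on 300.
agg = fedavg([[1.0, 2.0], [5.0, 6.0]], [100, 300])
# Weighted average: (1*100 + 5*300)/400 = 4.0, (2*100 + 6*300)/400 = 5.0
```

Weighting by example count keeps clients with little data from dominating the global model, which matters when data is unevenly distributed.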
Implementing Federated Learning with Flower (30 mins)
  • Step-by-step setup of a federated learning environment using the Flower framework
  • Live demo: Implementing a simple image classification task.
Flower Datasets (25 mins)
  • Introduce Flower Datasets library to create datasets for federated learning
  • Live demo: Present different approaches for partitioning.
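Two partitioning regimes worth contrasting before the demo: IID (shuffle and split evenly) versus a pathological non-IID split by label. The helpers below are illustrative sketches, not the Flower Datasets API:

```python
import random

def iid_partition(samples, num_clients, seed=0):
    """IID: shuffle, then deal samples round-robin across clients."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return [shuffled[i::num_clients] for i in range(num_clients)]

def label_partition(samples, labels, num_clients):
    """Pathological non-IID: each client receives only samples whose
    label maps to it (label modulo number of clients)."""
    parts = [[] for _ in range(num_clients)]
    for s, y in zip(samples, labels):
        parts[y % num_clients].append(s)
    return parts
```

Non-IID partitions like the second one are what make federated optimization hard in practice, and they are a key knob when benchmarking strategies.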
Privacy and Security Aspects of Federated Learning (45 mins)
Differential Privacy (DP) Introduction
  • Introduce the concept of differential privacy and its relevance to federated learning.
  • Discuss mechanisms for incorporating differential privacy into federated learning.
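The standard mechanism follows the clip-and-noise pattern behind DP-SGD and DP-FedAvg: bound each client update's L2 norm, then add calibrated Gaussian noise. A stdlib-only sketch (parameter names are illustrative, and no privacy accounting is shown):

```python
import math
import random

def dp_sanitize(update, clip_norm=1.0, noise_multiplier=1.1, seed=0):
    """Clip a client update to an L2 bound, then add Gaussian noise
    scaled to that bound (the clip-and-noise DP pattern)."""
    rng = random.Random(seed)
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [u * scale for u in update]
    sigma = noise_multiplier * clip_norm
    return [c + rng.gauss(0.0, sigma) for c in clipped]
```

Clipping caps any single client's influence on the aggregate; the noise then makes the result differentially private, with the privacy budget determined by the noise multiplier and the number of rounds.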
Secure Aggregation (SecAgg) Introduction
  • Explain the importance of secure aggregation in federated learning
  • Explore cryptographic techniques for secure aggregation, including homomorphic encryption and secure multi-party computation.
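The core idea of SecAgg-style protocols is pairwise masking: each pair of clients derives a shared random mask that one adds and the other subtracts, so individual updates look random to the server while the masks cancel in the sum. The sketch below illustrates only this cancellation property; a real protocol (e.g. Bonawitz et al.'s SecAgg) derives masks via key agreement and handles dropouts:

```python
import random

def masked_updates(updates, seed=42):
    """Pairwise masking: for each pair (i, j) with i < j, client i adds a
    shared random mask and client j subtracts it. Individual updates are
    hidden, but the masks cancel in the server-side sum."""
    n, dim = len(updates), len(updates[0])
    masked = [u[:] for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            # Stand-in for a pairwise shared secret between clients i and j.
            rng = random.Random(seed * 10_000 + i * 100 + j)
            mask = [rng.uniform(-100, 100) for _ in range(dim)]
            masked[i] = [a + m for a, m in zip(masked[i], mask)]
            masked[j] = [a - m for a, m in zip(masked[j], mask)]
    return masked

updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
server_sum = [sum(col) for col in zip(*masked_updates(updates))]
# Masks cancel: server_sum ≈ [9.0, 12.0], the true aggregate
```

The server learns only the aggregate, which pairs naturally with DP: noise can be added to the sum without any party ever seeing a raw individual update.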
DP and SecAgg in Federated Learning
  • Live demo: integrate DP and SecAgg using Flower
LLM training using FL in Flower (20 mins)
  • Overview of Language Model Training
  • Live demo: Hands-on Session with Flower
Other advanced topics in Federated Learning (15 mins)
  • Heterogeneous clients, underlying data distributions, communication overheads, high degree of parallelism
Q&A and Wrap-up (15 mins)

Join our channels for more event details and answers to all your questions!