FlowerTune LLM on Medical Dataset


This app performs federated instruction tuning of a pretrained Mistral-7B model on a Medical dataset. We use Flower Datasets to download, partition, and preprocess the dataset, and Flower's Simulation Engine to simulate the LLM fine-tuning process in a federated way, which allows users to run the training on a single GPU.

Methodology

This baseline performs federated LLM fine-tuning with LoRA using the 🤗PEFT library. The clients' models are aggregated with the FedAvg strategy. This provides a performance baseline for the Medical challenge leaderboard.
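For context on what the 🤗PEFT side of this looks like: wrapping the base model with LoRA adapters means only a small set of low-rank weights is trained and exchanged between clients and server. The sketch below is illustrative only; the model id and hyperparameter values are assumptions, not this baseline's actual settings:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Hypothetical LoRA settings for illustration; see the app's code for the real ones.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.3")
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable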

Fetch the app

Install Flower:

pip install flwr

Fetch the app:

flwr new @flwrlabs/flowertune-llm-medical
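This fetches the app into a new local directory; change into it before continuing (the directory name below is an assumed default, adjust if yours differs):

cd flowertune-llm-medical  # directory name assumed; use whatever flwr new created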

Environment setup

Project dependencies are defined in pyproject.toml. Install them in an activated Python environment with:

pip install -e .

Tip: Learn how to configure your pyproject.toml file for Flower apps in this guide.

Experimental setup

The dataset is divided into 20 partitions in an IID fashion, and one partition is assigned to each ClientApp. In each of the 200 rounds, we randomly sample a fraction (0.1) of the total nodes to participate. All the Flower app settings are defined in pyproject.toml.
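As a rough sketch of where these values live, the round count and sampling fraction are plain entries under [tool.flwr.app.config]; the key names below are assumptions for illustration, so check the generated pyproject.toml for the exact ones:

[tool.flwr.app.config]
num-server-rounds = 200        # assumed key: total federated rounds
strategy.fraction-train = 0.1  # assumed key: fraction of nodes sampled per round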

Before proceeding, you need to create a new SuperLink connection and define 20 virtual SuperNodes. To do this, first locate the Flower configuration file and then edit it.

Locate the Flower Configuration file:

flwr config list
# Example output:
Flower Config file: /path/to/your/.flwr/config.toml
SuperLink connections:
 supergrid
 local (default)

Add a new connection named flowertune and make it the default:

[superlink.flowertune]
options.num-supernodes = 20
options.backend.client-resources.num-cpus = 6
options.backend.client-resources.num-gpus = 1.0
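The num-gpus value is the fraction of one GPU reserved per ClientApp, so values below 1.0 let several clients run concurrently on the same GPU, as long as their combined VRAM fits (see the table in the VRAM consumption section below). A variant of the connection above, assuming your GPU has enough memory for two clients at once:

[superlink.flowertune]
options.num-supernodes = 20
options.backend.client-resources.num-cpus = 6
options.backend.client-resources.num-gpus = 0.5  # two ClientApps share each GPU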

[!IMPORTANT] Please note that the entries under [tool.flwr.app.config.static] must not be modified, to keep the competition fair, if you plan to participate in the LLM leaderboard. Additionally, the number of SuperNodes (i.e., options.num-supernodes) must be 20.

Running the challenge

First, make sure that you have access to the Mistral-7B model with your Hugging Face account. You can request access directly from the Hugging Face website. Then, follow the instructions here to log in to your account. Note that you only need to complete this step once on your development machine:

hf auth login
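On a headless machine or in CI, an alternative to the interactive login is to expose your token through the HF_TOKEN environment variable, which huggingface_hub picks up automatically:

export HF_TOKEN=<your-access-token>  # token from your Hugging Face account settings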

Run the challenge with the default config values. The configs are defined under the [tool.flwr.app.config] entry of pyproject.toml and are loaded automatically:

flwr run
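For quick local experiments, flwr run can also override individual [tool.flwr.app.config] values at launch time via --run-config. A sketch, assuming the app exposes a num-server-rounds key; remember that static entries must stay untouched for leaderboard submissions:

flwr run . --run-config "num-server-rounds=10"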

VRAM consumption

We use the Mistral-7B model with 4-bit quantization by default. The estimated VRAM consumption per client for each challenge is shown below:

| Challenges | GeneralNLP | Finance   | Medical   | Code      |
| :--------- | :--------- | :-------- | :-------- | :-------- |
| VRAM       | ~25.50 GB  | ~17.30 GB | ~22.80 GB | ~17.40 GB |

You can adjust the CPU/GPU resources assigned to each client to match your device; they are specified with options.backend.client-resources.num-cpus and options.backend.client-resources.num-gpus under your flowertune connection in config.toml.

Model saving

The global PEFT model checkpoints are saved every 5 rounds after aggregation on the server side by default; this interval can be changed with train.save-every-round under the [tool.flwr.app.config] entry in pyproject.toml.
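To sanity-check a saved checkpoint, you can load the adapter on top of the base model with 🤗PEFT. A minimal sketch; the model id and checkpoint path below are placeholders to adjust to your setup:

from peft import PeftModel
from transformers import AutoModelForCausalLM

# Placeholder paths/ids; point them at your base model and saved checkpoint directory.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.3")
model = PeftModel.from_pretrained(base, "<path-to-saved-peft-checkpoint>")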

[!NOTE] Please provide the last PEFT checkpoint if you plan to participate in the LLM leaderboard.