ExponentialPartitioner

class ExponentialPartitioner(num_partitions: int)

Bases: IdToSizeFncPartitioner

Partitioner that creates partitions whose sizes are correlated with exp(partition ID).

The amount of data each client receives is correlated with the exponential of its partition ID. For instance, if the IDs range from 1 to M, the client with ID 1 gets e units of data, the client with ID 2 gets e^2 units, and so on, up to client M, which gets e^M units. Each of these numbers is rounded down (floored), so the fractional part is always discarded: floor(e) = floor(2.71…) = 2 and floor(e^2) = floor(7.39…) = 7. The samples left unassigned by this rounding are added to the largest partition (the one with the highest partition_id).
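The sizing rule described above can be sketched in plain Python. This is an illustrative approximation, not the library's implementation: the helper name `exponential_partition_sizes` is hypothetical, and it assumes the e^i "units" are relative weights that are scaled to the dataset size before being rounded down.

```python
import math


def exponential_partition_sizes(num_samples: int, num_partitions: int) -> list[int]:
    """Hypothetical helper: split num_samples into exp-weighted partition sizes."""
    # Relative weight of each partition: e^1, e^2, ..., e^M.
    weights = [math.exp(i) for i in range(1, num_partitions + 1)]
    total_weight = sum(weights)
    # Assumption: weights are scaled to the dataset size, then rounded down.
    sizes = [math.floor(num_samples * w / total_weight) for w in weights]
    # Samples lost to rounding are added to the last (largest) partition.
    sizes[-1] += num_samples - sum(sizes)
    return sizes


sizes = exponential_partition_sizes(num_samples=1000, num_partitions=4)
print(sizes)
```

Under these assumptions the sizes always sum to the dataset size, and each partition is at least as large as the one before it.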

Parameters:

num_partitions (int) – The total number of partitions that the data will be divided into.

Examples

>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.partitioner import ExponentialPartitioner
>>>
>>> partitioner = ExponentialPartitioner(num_partitions=10)
>>> fds = FederatedDataset(dataset="mnist", partitioners={"train": partitioner})
>>> partition = fds.load_partition(0)

Methods

is_dataset_assigned()

Check if a dataset has been assigned to the partitioner.

load_partition(partition_id)

Load a single partition based on the partition index.

Attributes

dataset

Dataset property.

num_partitions

Total number of partitions.

partition_id_to_indices

Mapping from partition ID to the list of indices.

partition_id_to_size

Mapping from partition ID to the number of samples.

property dataset: Dataset

Dataset property.

is_dataset_assigned() → bool

Check if a dataset has been assigned to the partitioner.

This method returns True if a dataset is already set for the partitioner; otherwise, it returns False.

Returns:

dataset_assigned – True if a dataset is assigned, otherwise False.

Return type:

bool

load_partition(partition_id: int) → Dataset

Load a single partition based on the partition index.

The number of samples in the returned partition depends on its partition_id.

Parameters:

partition_id (int) – the index that corresponds to the requested partition

Returns:

dataset_partition – single dataset partition

Return type:

Dataset

property num_partitions: int

Total number of partitions.

property partition_id_to_indices: dict[int, list[int]]

Mapping from partition ID to the list of indices.

property partition_id_to_size: dict[int, int]

Mapping from partition ID to the number of samples.
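To illustrate how the two mappings relate, the following sketch builds a `dict[int, list[int]]` of indices from a `dict[int, int]` of sizes. The helper `sizes_to_indices` is hypothetical, and the contiguous, in-order assignment of indices is an assumption made for illustration; the page does not state how the library chooses indices.

```python
def sizes_to_indices(partition_id_to_size: dict[int, int]) -> dict[int, list[int]]:
    """Hypothetical helper: give each partition a contiguous block of indices."""
    partition_id_to_indices: dict[int, list[int]] = {}
    start = 0
    for partition_id in sorted(partition_id_to_size):
        size = partition_id_to_size[partition_id]
        # Assumption: partitions take consecutive index ranges in ID order.
        partition_id_to_indices[partition_id] = list(range(start, start + size))
        start += size
    return partition_id_to_indices


# Made-up sizes, chosen only to keep the output short.
example_sizes = {0: 2, 1: 3, 2: 5}
example_indices = sizes_to_indices(example_sizes)
print(example_indices)
```

In this sketch, the length of each index list equals the corresponding entry in the size mapping, which is the relationship the two attributes are expected to satisfy.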