ExponentialPartitioner

class ExponentialPartitioner(num_partitions: int)

Bases: IdToSizeFncPartitioner

Partitioner that creates partitions whose sizes are correlated with exp(partition ID).

The amount of data each client gets is correlated with the exponential of its partition ID. For instance, if the IDs range from 1 to M, the client with ID 1 gets e units of data, client 2 gets e^2 units, and so on, up to client M, which gets e^M units. Each of these numbers is rounded down with the floor operation, so the fractional part is always cut: floor(e) = floor(2.71…) = 2 and floor(e^2) = floor(7.39…) = 7. The samples left unassigned by this rounding are added to the largest partition (the one with the largest partition_id).
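The sizing rule above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name `exponential_partition_sizes` is hypothetical, and it assumes the "units" are normalized so that partition sizes are proportional to e^i and sum to the dataset size.

```python
import math


def exponential_partition_sizes(num_samples: int, num_partitions: int) -> list[int]:
    """Hypothetical helper: partition sizes proportional to e^i, i = 1..num_partitions."""
    # One "unit" weight per partition: e^1, e^2, ..., e^M.
    units = [math.exp(i) for i in range(1, num_partitions + 1)]
    total_units = sum(units)
    # Assumption: each partition gets its proportional share, rounded down.
    sizes = [math.floor(num_samples * u / total_units) for u in units]
    # Samples lost to rounding go to the last (largest) partition.
    sizes[-1] += num_samples - sum(sizes)
    return sizes


sizes = exponential_partition_sizes(num_samples=1000, num_partitions=3)
print(sizes)  # [90, 244, 666]
```

Under these assumptions, the floors sum to 999, so the single leftover sample is added to the last partition.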

Parameters: num_partitions (int) – The total number of partitions that the data will be divided into.

Examples

>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.partitioner import ExponentialPartitioner
>>>
>>> partitioner = ExponentialPartitioner(num_partitions=10)
>>> fds = FederatedDataset(dataset="mnist", partitioners={"train": partitioner})
>>> partition = fds.load_partition(0)

Methods

`is_dataset_assigned`()	Check if a dataset has been assigned to the partitioner.
`load_partition`(partition_id)	Load a single partition based on the partition index.

Attributes

`dataset`	Dataset property.
`num_partitions`	Total number of partitions.
`partition_id_to_indices`	Partition ID to the list of indices.
`partition_id_to_size`	Partition ID to the number of samples.

property dataset: Dataset

Dataset property.

is_dataset_assigned() → bool

Check if a dataset has been assigned to the partitioner.

This method returns True if a dataset has already been set for the partitioner; otherwise, it returns False.

Returns: dataset_assigned – True if a dataset is assigned, otherwise False.
Return type: bool

load_partition(partition_id: int) → Dataset

Load a single partition based on the partition index.

The number of samples in the partition depends on partition_id.

Parameters: partition_id (int) – The index of the requested partition.
Returns: dataset_partition – A single dataset partition.
Return type: Dataset

property num_partitions: int

Total number of partitions.

property partition_id_to_indices: dict[int, list[int]]

Partition ID to the list of indices.

property partition_id_to_size: dict[int, int]

Partition ID to the number of samples.
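To illustrate how the two mappings relate, the sketch below derives an index mapping from a size mapping. It assumes that each partition receives a consecutive range of dataset indices, which this page does not state; the helper `sizes_to_index_ranges` is hypothetical and not part of the library.

```python
from itertools import accumulate


def sizes_to_index_ranges(partition_id_to_size: dict[int, int]) -> dict[int, list[int]]:
    """Hypothetical helper: give each partition a consecutive block of indices."""
    # Running totals mark where each partition's block ends.
    ends = list(accumulate(partition_id_to_size.values()))
    starts = [0] + ends[:-1]
    return {
        pid: list(range(start, end))
        for pid, start, end in zip(partition_id_to_size, starts, ends)
    }


partition_id_to_size = {0: 2, 1: 3, 2: 5}
partition_id_to_indices = sizes_to_index_ranges(partition_id_to_size)
print(partition_id_to_indices)
# {0: [0, 1], 1: [2, 3, 4], 2: [5, 6, 7, 8, 9]}
```

Whatever the actual assignment scheme, the length of each index list should match the corresponding entry in `partition_id_to_size`.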