SizePartitioner

class SizePartitioner(partition_sizes: Sequence[int])

Bases: Partitioner

Partitioner that creates partitions of user-specified sizes.

Parameters:

partition_sizes (Sequence[int]) – The size of each partition. partition_id 0 will have partition_sizes[0] samples, partition_id 1 will have partition_sizes[1] samples, etc.

Examples

>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.partitioner import SizePartitioner
>>>
>>> partition_sizes = [15_000, 5_000, 30_000]
>>> partitioner = SizePartitioner(partition_sizes)
>>> fds = FederatedDataset(dataset="cifar10", partitioners={"train": partitioner})
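Before constructing the partitioner, it can help to confirm that the requested sizes fit the split being partitioned. The sketch below is a standalone pre-check, not part of flwr_datasets: `check_partition_sizes` is a hypothetical helper, and the count of 50,000 training samples for CIFAR-10 is an assumption passed in by the caller. The library may run its own validation, which this sketch does not attempt to reproduce.

```python
from typing import Sequence


def check_partition_sizes(partition_sizes: Sequence[int], num_samples: int) -> int:
    """Return the number of samples left unassigned, or raise ValueError.

    Hypothetical helper for illustration; not part of flwr_datasets.
    """
    if len(partition_sizes) == 0:
        raise ValueError("partition_sizes must contain at least one size")
    for partition_id, size in enumerate(partition_sizes):
        if size <= 0:
            raise ValueError(f"partition {partition_id} has non-positive size {size}")
    total = sum(partition_sizes)
    if total > num_samples:
        raise ValueError(f"requested {total} samples but only {num_samples} are available")
    return num_samples - total


# The sizes from the example above sum to 50_000, the assumed size of the
# CIFAR-10 train split, so no samples are left over.
unassigned = check_partition_sizes([15_000, 5_000, 30_000], num_samples=50_000)
print(unassigned)  # prints: 0
```

Running a check like this first turns a size mismatch into an immediate, readable error rather than a failure later when a partition is loaded.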

Methods

is_dataset_assigned()

Check if a dataset has been assigned to the partitioner.

load_partition(partition_id)

Load a single partition of size partition_sizes[partition_id].

Attributes

dataset

The dataset assigned to the partitioner.

num_partitions

Total number of partitions.

partition_id_to_indices

Mapping from partition ID to the sample indices assigned to it (the result of partitioning).

property dataset: Dataset

The dataset assigned to the partitioner.

is_dataset_assigned() → bool

Check if a dataset has been assigned to the partitioner.

This method returns True if a dataset is already set for the partitioner; otherwise, it returns False.

Returns:

dataset_assigned – True if a dataset is assigned, otherwise False.

Return type:

bool

load_partition(partition_id: int) → Dataset

Load a single partition of size partition_sizes[partition_id].

For example, given partition_sizes=[20_000, 10_000, 30_000], partition_id=0 returns a partition of size 20_000, partition_id=1 returns a partition of size 10_000, and so on.

Parameters:

partition_id (int) – The index that corresponds to the requested partition.

Returns:

dataset_partition – Single dataset partition.

Return type:

Dataset
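The correspondence between partition_id and partition size can be sketched in plain Python. This is a minimal illustration that assumes partitions are cut as consecutive, non-overlapping index ranges in dataset order. The text above does not state how indices are assigned, and `sizes_to_indices` is a hypothetical helper, not a flwr_datasets function. The resulting mapping has the same shape as partition_id_to_indices (dict[int, list[int]]).

```python
from typing import Dict, List, Sequence


def sizes_to_indices(partition_sizes: Sequence[int]) -> Dict[int, List[int]]:
    """Map each partition_id to a consecutive block of sample indices.

    Hypothetical helper; assumes contiguous assignment in dataset order.
    """
    mapping: Dict[int, List[int]] = {}
    start = 0
    for partition_id, size in enumerate(partition_sizes):
        # Partition `partition_id` takes the next `size` indices.
        mapping[partition_id] = list(range(start, start + size))
        start += size
    return mapping


mapping = sizes_to_indices([20_000, 10_000, 30_000])
print(len(mapping[0]), len(mapping[1]), len(mapping[2]))  # prints: 20000 10000 30000
print(mapping[1][0], mapping[1][-1])  # prints: 20000 29999
```

Under this assumption, each partition's length equals the corresponding entry of partition_sizes, and no index appears in more than one partition.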

property num_partitions: int

Total number of partitions.

property partition_id_to_indices: dict[int, list[int]]

Mapping from partition ID to the sample indices assigned to it (the result of partitioning).