NaturalIdPartitioner
- class NaturalIdPartitioner(partition_by: str)
Bases: Partitioner
Partitioner for a dataset that can be divided by a column with partition ids.
- Parameters:
partition_by (str) – The name of the column whose unique values identify the partitions.
Examples
“flwrlabs/shakespeare” dataset
>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.partitioner import NaturalIdPartitioner
>>>
>>> partitioner = NaturalIdPartitioner(partition_by="character_id")
>>> fds = FederatedDataset(dataset="flwrlabs/shakespeare",
...                        partitioners={"train": partitioner})
>>> partition = fds.load_partition(0)
“sentiment140” (aka Twitter) dataset
>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.partitioner import NaturalIdPartitioner
>>>
>>> partitioner = NaturalIdPartitioner(partition_by="user")
>>> fds = FederatedDataset(dataset="sentiment140",
...                        partitioners={"train": partitioner})
>>> partition = fds.load_partition(0)
Methods
is_dataset_assigned() – Check if a dataset has been assigned to the partitioner.
load_partition(partition_id) – Load a single partition corresponding to a single partition_id.
Attributes
dataset – Dataset property.
num_partitions – Total number of partitions.
partition_id_to_natural_id – Mapping from partition id to the corresponding natural id.
- property dataset: Dataset
Dataset property.
- is_dataset_assigned() → bool
Check if a dataset has been assigned to the partitioner.
This method returns True if a dataset is already set for the partitioner; otherwise it returns False.
- Returns:
dataset_assigned – True if a dataset is assigned, otherwise False.
- Return type:
bool
- load_partition(partition_id: int) → Dataset
Load a single partition corresponding to a single partition_id.
Each natural id (a unique value in the partition_by column of the dataset) is assigned a unique integer, and partition_id selects the partition whose natural id was assigned that integer.
- Parameters:
partition_id (int) – The index that corresponds to the requested partition.
- Returns:
dataset_partition – A single dataset partition.
- Return type:
Dataset
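The selection logic described for load_partition can be illustrated with a small pure-Python sketch. It does not use flwr_datasets and is not the library's implementation: the rows, the helper names build_partition_id_to_natural_id and load_partition_sketch, and the first-seen ordering of natural ids are all assumptions made for illustration.

```python
# Toy rows standing in for a dataset; "character_id" plays the role of
# the partition_by column. The values are made up for illustration.
rows = [
    {"character_id": "HAMLET", "text": "To be, or not to be"},
    {"character_id": "OPHELIA", "text": "My lord, I have remembrances"},
    {"character_id": "HAMLET", "text": "The rest is silence"},
]


def build_partition_id_to_natural_id(rows, partition_by):
    """Assign consecutive integers to the unique values of partition_by.

    Assumption: ids are numbered in first-seen order; the real library
    may order them differently.
    """
    # dict.fromkeys drops duplicates while keeping first-seen order.
    unique_ids = list(dict.fromkeys(row[partition_by] for row in rows))
    return dict(enumerate(unique_ids))


def load_partition_sketch(rows, partition_by, partition_id):
    """Return the rows whose natural id is mapped to partition_id."""
    mapping = build_partition_id_to_natural_id(rows, partition_by)
    natural_id = mapping[partition_id]
    return [row for row in rows if row[partition_by] == natural_id]


mapping = build_partition_id_to_natural_id(rows, "character_id")
partition_0 = load_partition_sketch(rows, "character_id", 0)
```

In this sketch, partition 0 contains every row of the first natural id seen, so the number of partitions equals the number of distinct values in the chosen column.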
- property num_partitions: int
Total number of partitions.
- property partition_id_to_natural_id: dict[int, str]
Mapping from partition id to the corresponding natural id.
Natural ids are the unique values in the partition_by column of the dataset.
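Because the mapping runs from partition id to natural id, finding the partition of a known natural id needs the inverse lookup. A minimal sketch, assuming a plain dict with the documented dict[int, str] shape; the example values and the name natural_id_to_partition_id are hypothetical:

```python
# Hypothetical mapping with the documented dict[int, str] shape.
partition_id_to_natural_id = {0: "HAMLET", 1: "OPHELIA"}

# Invert it; this is safe because natural ids are unique values.
natural_id_to_partition_id = {
    natural_id: partition_id
    for partition_id, natural_id in partition_id_to_natural_id.items()
}

# Look up which partition id belongs to a given natural id.
ophelia_partition_id = natural_id_to_partition_id["OPHELIA"]
```

The resulting integer could then be passed as partition_id when loading that partition.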