concatenate_divisions
- concatenate_divisions(partitioner: Partitioner, partition_division: list[float] | tuple[float, ...] | dict[str, float], division_id: int | str) → Dataset
Create a dataset by concatenation of divisions from all partitions.
The divisions are created based on the partition_division and accessed based on the division_id. This function can be used to create, e.g., a centralized dataset from federated on-edge test sets.
- Parameters:
partitioner (Partitioner) – Partitioner object with assigned dataset.
partition_division (Union[List[float], Tuple[float, ...], Dict[str, float]]) – Fractions specifying how each partition of the partitioner is split into divisions. You can think of this as an on-edge division of the data into multiple divisions (e.g. into train and validation). E.g. [0.8, 0.2] or {"partition_train": 0.8, "partition_test": 0.2}.
division_id (Union[int, str]) – The way to access the division (from a List or DatasetDict). If partition_division is specified as a list, then division_id is an index of an element in that list. If partition_division is passed as a Dict, then division_id is a key of that dictionary.
- Returns:
concatenated_divisions – A dataset created by concatenating the selected division from all partitions.
- Return type:
Dataset
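The split-then-concatenate behaviour described above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: plain lists stand in for partitions, the helper names split_partition and concatenate_divisions_sketch are hypothetical, and the way slice boundaries are rounded is an assumption made for this illustration.

```python
from typing import Dict, Hashable, List, Sequence, Union

# Fractions given either positionally (list/tuple) or by name (dict).
PartitionDivision = Union[Sequence[float], Dict[str, float]]


def split_partition(
    partition: List[int], partition_division: PartitionDivision
) -> Dict[Hashable, List[int]]:
    """Split one partition into consecutive slices sized by the fractions.

    Hypothetical helper: boundaries are rounded cumulative fractions,
    which is an assumption of this sketch, not the library's rule.
    """
    if isinstance(partition_division, dict):
        keys = list(partition_division.keys())
        fractions = list(partition_division.values())
    else:
        # For a list or tuple, the division is addressed by its index.
        keys = list(range(len(partition_division)))
        fractions = list(partition_division)

    divisions: Dict[Hashable, List[int]] = {}
    start = 0
    cumulative = 0.0
    for key, fraction in zip(keys, fractions):
        cumulative += fraction
        end = int(round(cumulative * len(partition)))
        divisions[key] = partition[start:end]
        start = end
    return divisions


def concatenate_divisions_sketch(
    partitions: List[List[int]],
    partition_division: PartitionDivision,
    division_id: Union[int, str],
) -> List[int]:
    """Take the division selected by division_id from every partition and join them."""
    result: List[int] = []
    for partition in partitions:
        result.extend(split_partition(partition, partition_division)[division_id])
    return result


# Three toy partitions of ten samples each.
partitions = [list(range(0, 10)), list(range(10, 20)), list(range(20, 30))]

by_index = concatenate_divisions_sketch(partitions, [0.8, 0.2], 1)
by_key = concatenate_divisions_sketch(partitions, {"train": 0.8, "test": 0.2}, "test")

print(by_index)  # [8, 9, 18, 19, 28, 29]
print(by_key == by_index)  # True
```

In this sketch, the list form with division_id=1 and the dict form with division_id="test" select the same 20% slice of every partition, so both calls produce the same concatenated result.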
Examples
Use concatenate_divisions with the division specified as a list.
>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.utils import concatenate_divisions
>>>
>>> fds = FederatedDataset(dataset="mnist", partitioners={"train": 100})
>>> concatenated_divisions = concatenate_divisions(
...     partitioner=fds.partitioners["train"],
...     partition_division=[0.8, 0.2],
...     division_id=1
... )
>>> print(concatenated_divisions)
Use concatenate_divisions with the division specified as a dict. This accomplishes the same goal as the list example above.
>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.utils import concatenate_divisions
>>>
>>> fds = FederatedDataset(dataset="mnist", partitioners={"train": 100})
>>> concatenated_divisions = concatenate_divisions(
...     partitioner=fds.partitioners["train"],
...     partition_division={"train": 0.8, "test": 0.2},
...     division_id="test"
... )
>>> print(concatenated_divisions)