divide_dataset
- divide_dataset(dataset: Dataset, division: list[float] | tuple[float, ...] | dict[str, float]) -> list[Dataset] | DatasetDict
Divide the dataset into splits according to the division.
The division supports a varying number of splits, which you can name by passing a dict. The splits are created from the beginning of the dataset, in the order the fractions are given.
- Parameters:
dataset (Dataset) – Dataset to be divided.
division (Union[List[float], Tuple[float, ...], Dict[str, float]]) – Configuration specifying how the dataset is divided. Each fraction must be greater than 0 and at most 1, and the fractions must sum to at most 1 (a sum smaller than 1 is allowed).
- Returns:
divided_dataset – A List[Dataset] if division is a List or Tuple; a DatasetDict if division is a Dict.
- Return type:
Union[List[Dataset], DatasetDict]
Examples
Use divide_dataset with division specified as a list.
>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.utils import divide_dataset
>>>
>>> fds = FederatedDataset(dataset="mnist", partitioners={"train": 100})
>>> partition = fds.load_partition(0)
>>> division = [0.8, 0.2]
>>> train, test = divide_dataset(dataset=partition, division=division)
Use divide_dataset with division specified as a dict (this accomplishes the same goal as the example with a list above).
>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.utils import divide_dataset
>>>
>>> fds = FederatedDataset(dataset="mnist", partitioners={"train": 100})
>>> partition = fds.load_partition(0)
>>> division = {"train": 0.8, "test": 0.2}
>>> train_test = divide_dataset(dataset=partition, division=division)
>>> train, test = train_test["train"], train_test["test"]
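To make the fraction rules concrete, the following is a minimal sketch of how a division could map to contiguous index ranges taken from the beginning of a dataset. It does not call flwr_datasets: sketch_division_ranges is a hypothetical helper written for illustration, and truncating each split size with int() is an assumption about rounding, not documented behavior.

```python
import math
from typing import Dict, List, Tuple, Union

Division = Union[List[float], Tuple[float, ...], Dict[str, float]]


def sketch_division_ranges(
    num_rows: int, division: Division
) -> Union[List[range], Dict[str, range]]:
    """Hypothetical helper: map a division to index ranges from the start."""
    if isinstance(division, dict):
        fractions = list(division.values())
    else:
        fractions = list(division)

    # Documented constraints: each fraction in (0, 1], total at most 1.
    for fraction in fractions:
        if not 0 < fraction <= 1:
            raise ValueError(f"Fraction {fraction} is not in (0, 1].")
    total = sum(fractions)
    # Tolerate floating-point noise when the fractions sum to exactly 1.
    if total > 1 and not math.isclose(total, 1.0):
        raise ValueError(f"Fractions sum to {total}, which is more than 1.")

    ranges = []
    start = 0
    for fraction in fractions:
        # Assumption: split sizes are truncated to whole rows.
        end = start + int(num_rows * fraction)
        ranges.append(range(start, end))
        start = end

    # Mirror the return shapes: dict in, dict out; list or tuple in, list out.
    if isinstance(division, dict):
        return dict(zip(division.keys(), ranges))
    return ranges


if __name__ == "__main__":
    print(sketch_division_ranges(10, [0.5, 0.25]))
    print(sketch_division_ranges(10, {"train": 0.8, "test": 0.2}))
```

With 10 rows and the division [0.5, 0.25], the sketch assigns rows 0 to 4 to the first split and rows 5 and 6 to the second; the last three rows belong to no split because the fractions sum to less than 1.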