divide_dataset
- divide_dataset(dataset: Dataset, division: list[float] | tuple[float, ...] | dict[str, float]) → list[Dataset] | DatasetDict
- Divide the dataset according to the division.
- The division supports a varying number of splits, which you can name. The splits are created from the beginning of the dataset.
- Parameters:
- dataset (Dataset) – Dataset to be divided. 
- division (Union[List[float], Tuple[float, ...], Dict[str, float]]) – Configuration specifying how the dataset is divided. Each fraction must be greater than 0 and at most 1, and the fractions must sum to at most 1. A smaller sum is allowed, in which case part of the dataset is not assigned to any split.
 
- Returns:
- divided_dataset – A List[Dataset] if division is a List or Tuple, or a DatasetDict if division is a Dict.
- Return type:
- Union[List[Dataset], DatasetDict] 
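The constraints on division described under Parameters can be checked before calling divide_dataset. The following is a minimal sketch, not the library's own validation: `check_division` is a hypothetical helper, and the small floating-point tolerance on the sum is an assumption made for this illustration.

```python
import math


def check_division(division):
    """Hypothetical helper: validate fractions against the documented constraints.

    Accepts a list, tuple, or dict of fractions and returns them as a list.
    """
    if isinstance(division, dict):
        fractions = list(division.values())
    else:
        fractions = list(division)

    for fraction in fractions:
        # Each fraction has to be >0 and <=1.
        if not 0.0 < fraction <= 1.0:
            raise ValueError(f"Each fraction must be >0 and <=1, got {fraction}.")

    total = math.fsum(fractions)
    # The fractions have to sum to at most 1. The tolerance for rounding
    # error is an assumption of this sketch, not documented behavior.
    if total > 1.0 + 1e-9:
        raise ValueError(f"Fractions must sum to at most 1, got {total}.")
    return fractions
```

With this helper, `check_division([0.8, 0.2])` passes, while `check_division([0.8, 0.3])` raises a `ValueError` because the fractions sum to more than 1.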
- Examples
- Use divide_dataset with division specified as a list.
>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.utils import divide_dataset
>>>
>>> fds = FederatedDataset(dataset="mnist", partitioners={"train": 100})
>>> partition = fds.load_partition(0)
>>> division = [0.8, 0.2]
>>> train, test = divide_dataset(dataset=partition, division=division)
- Use divide_dataset with division specified as a dict (this accomplishes the same goal as the example with a list above).
>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.utils import divide_dataset
>>>
>>> fds = FederatedDataset(dataset="mnist", partitioners={"train": 100})
>>> partition = fds.load_partition(0)
>>> division = {"train": 0.8, "test": 0.2}
>>> train_test = divide_dataset(dataset=partition, division=division)
>>> train, test = train_test["train"], train_test["test"]
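To illustrate what "created from the beginning of the dataset" means, the sketch below maps fractions to consecutive index ranges starting at index 0. It is a plain-Python illustration under stated assumptions, not the library's implementation: `division_ranges` is a hypothetical name, and truncating each split size with `int()` is an assumed rounding rule.

```python
def division_ranges(num_samples, fractions):
    """Hypothetical helper: map fractions to consecutive index ranges.

    Each split starts where the previous one ended, so the first split
    always begins at index 0.
    """
    ranges = []
    start = 0
    for fraction in fractions:
        # Truncation is an assumption of this sketch; the actual rounding
        # used by divide_dataset may differ.
        end = start + int(num_samples * fraction)
        ranges.append(range(start, end))
        start = end
    return ranges


# Fractions summing to less than 1 leave the tail of the dataset unused.
first, second = division_ranges(100, [0.5, 0.25])
print(first, second)  # range(0, 50) range(50, 75)
```

In this sketch, indices 75 to 99 belong to neither split, which matches the note that the fractions may sum to less than 1.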