ArrayRecord
- class ArrayRecord
- class ArrayRecord(array_dict: OrderedDict[str, Array], *, keep_input: bool = True)
- class ArrayRecord(numpy_ndarrays: list[ndarray[Any, dtype[Any]]], *, keep_input: bool = True)
- class ArrayRecord(torch_state_dict: OrderedDict[str, torch.Tensor], *, keep_input: bool = True)
Bases: TypedDict[str, Array], InflatableObject
Array record.
A typed dictionary (str to Array) that can store named arrays, including model parameters, gradients, embeddings or non-parameter arrays. Internally, this behaves similarly to an OrderedDict[str, Array]. An ArrayRecord can be viewed as an equivalent to PyTorch's state_dict, but it holds arrays in a serialized form.
This object is one of the record types supported by RecordDict and can therefore be stored in the content of a Message or the state of a Context.
This class can be instantiated in multiple ways:
- By providing nothing (empty container).
- By providing a dictionary of Array (via the array_dict argument).
- By providing a list of NumPy ndarray (via the numpy_ndarrays argument).
- By providing a PyTorch state_dict (via the torch_state_dict argument).
- Parameters:
array_dict (Optional[OrderedDict[str, Array]] (default: None)) – An existing dictionary containing named Array instances. If provided, these entries will be used directly to populate the record.
numpy_ndarrays (Optional[list[NDArray]] (default: None)) – A list of NumPy arrays. Each array will be automatically converted into an Array and stored in this record with generated keys.
torch_state_dict (Optional[OrderedDict[str, torch.Tensor]] (default: None)) – A PyTorch state_dict (str keys to torch.Tensor values). Each tensor will be converted into an Array and stored in this record.
keep_input (bool (default: True)) – If False, entries from the input are removed after being added to this record to free up memory. If True, the input remains unchanged. Regardless of this value, no duplicate memory is used if the input is a dictionary of Array, i.e., array_dict.
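The keep_input behaviour described above can be sketched with a small stand-in that uses only the standard library. This is not the library's implementation: build_record is a hypothetical helper, the "0", "1", ... key scheme is an assumption about what "generated keys" could look like, and array.array stands in for a NumPy array.

```python
import array
from collections import OrderedDict


def build_record(arrays, *, keep_input=True):
    """Hypothetical stand-in for constructing a record from a list of arrays.

    Each input is serialized to bytes and stored under a generated key
    (assumed here to be "0", "1", ...). With keep_input=False, entries are
    popped from the caller's list as they are stored, so the unserialized
    copies can be garbage-collected early.
    """
    record = OrderedDict()
    # With keep_input=True we consume a shallow copy, leaving the caller's list intact.
    source = list(arrays) if keep_input else arrays
    index = 0
    while source:
        item = source.pop(0)
        record[str(index)] = item.tobytes()
        index += 1
    return record


weights = [array.array("f", [1.0, 2.0]), array.array("f", [3.0])]

kept = build_record(weights, keep_input=True)
n_after_keep = len(weights)      # input list is unchanged

consumed = build_record(weights, keep_input=False)
n_after_consume = len(weights)   # input list has been emptied
```

Passing keep_input=False is mainly useful for large models, where holding both the original arrays and their serialized form would roughly double peak memory.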
Examples
Initializing an empty ArrayRecord:
    record = ArrayRecord()
Initializing with a dictionary of Array:
    arr = Array("float32", [5, 5], "numpy.ndarray", b"serialized_data...")
    record = ArrayRecord({"weight": arr})
Initializing with a list of NumPy arrays:
    import numpy as np
    arr1 = np.random.randn(3, 3)
    arr2 = np.random.randn(2, 2)
    record = ArrayRecord([arr1, arr2])
Initializing with a PyTorch model state_dict:
    import torch.nn as nn
    model = nn.Linear(10, 5)
    record = ArrayRecord(model.state_dict())
Initializing with TensorFlow model weights (a list of NumPy arrays):
    import tensorflow as tf
    model = tf.keras.Sequential([tf.keras.layers.Dense(5, input_shape=(10,))])
    record = ArrayRecord(model.get_weights())
Methods
clear() – Remove all items from D.
count_bytes() – Return number of bytes stored in this object.
deflate() – Deflate the ArrayRecord.
from_array_dict(array_dict, *[, keep_input]) – Create ArrayRecord from a dictionary of Array.
from_numpy_ndarrays(ndarrays, *[, keep_input]) – Create ArrayRecord from a list of NumPy ndarray.
from_torch_state_dict(state_dict, *[, ...]) – Create ArrayRecord from PyTorch state_dict.
get(k[, d]) – Return D[k] if k in D, else d.
inflate(object_content[, children]) – Inflate an ArrayRecord from bytes.
items() – Return a set-like object providing a view on D's items.
keys() – Return a set-like object providing a view on D's keys.
pop(k[, d]) – Remove specified key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised.
popitem() – Remove and return some (key, value) pair as a 2-tuple; raise KeyError if D is empty.
setdefault(k[, d]) – Return D.get(k, d), also set D[k] = d if k not in D.
to_numpy_ndarrays(*[, keep_input]) – Return the ArrayRecord as a list of NumPy ndarray.
to_torch_state_dict(*[, keep_input]) – Return the ArrayRecord as a PyTorch state_dict.
update([E, ]**F) – Update D from mapping/iterable E and F.
values() – Return an object providing a view on D's values.
Attributes
children – Return a dictionary of Arrays with their Object IDs as keys.
is_dirty – Check if the object is dirty after the last deflation.
object_id – Get object ID.
- property children: dict[str, InflatableObject]
Return a dictionary of Arrays with their Object IDs as keys.
- clear() → None. Remove all items from D.
- count_bytes() → int
Return number of bytes stored in this object.
Note that this count might also include a small number of bytes of metadata belonging to the serialized object (e.g. of a NumPy array), which is needed for deserialization.
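To illustrate why a byte count over serialized data can exceed the raw array size, here is a minimal sketch using a made-up format: a length-prefixed JSON header followed by raw float32 data. The serialize function and its header layout are hypothetical and are not the format used by ArrayRecord.

```python
import array
import json


def serialize(values, shape):
    """Hypothetical serializer: length-prefixed JSON header + raw float32 payload."""
    payload = array.array("f", values).tobytes()
    header = json.dumps({"dtype": "float32", "shape": shape}).encode("utf-8")
    # The 4-byte prefix tells a reader where the header ends and the payload starts.
    return len(header).to_bytes(4, "little") + header + payload


blob = serialize([0.0] * 25, [5, 5])

raw_size = 25 * 4              # bytes of actual array data (25 float32 values)
stored_size = len(blob)        # what a count over the serialized bytes reports
overhead = stored_size - raw_size  # metadata needed to reconstruct the array
```

The overhead is a fixed cost per array, so it matters most for records holding many small arrays.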
- classmethod from_array_dict(array_dict: OrderedDict[str, Array], *, keep_input: bool = True) → ArrayRecord
Create ArrayRecord from a dictionary of Array.
- classmethod from_numpy_ndarrays(ndarrays: list[ndarray[Any, dtype[Any]]], *, keep_input: bool = True) → ArrayRecord
Create ArrayRecord from a list of NumPy ndarray.
- classmethod from_torch_state_dict(state_dict: OrderedDict[str, torch.Tensor], *, keep_input: bool = True) → ArrayRecord
Create ArrayRecord from PyTorch state_dict.
- get(k[, d]) → D[k] if k in D, else d. d defaults to None.
- classmethod inflate(object_content: bytes, children: dict[str, InflatableObject] | None = None) → ArrayRecord
Inflate an ArrayRecord from bytes.
- Parameters:
object_content (bytes) – The deflated object content of the ArrayRecord.
children (Optional[dict[str, InflatableObject]] (default: None)) – Dictionary of children InflatableObjects mapped to their Object IDs. These children enable the full inflation of the ArrayRecord.
- Returns:
The inflated ArrayRecord.
- Return type:
ArrayRecord
- property is_dirty: bool
Check if the object is dirty after the last deflation.
- items() → a set-like object providing a view on D's items.
- keys() → a set-like object providing a view on D's keys.
- property object_id: str
Get object ID.
- pop(k[, d]) → v, remove specified key and return the corresponding value.
If key is not found, d is returned if given, otherwise KeyError is raised.
- popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.
- setdefault(k[, d]) → D.get(k, d), also set D[k] = d if k not in D.
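The lookup and removal semantics of get, pop and setdefault can be sketched with a plain OrderedDict as a stand-in, since the record is described as behaving similarly to an OrderedDict[str, Array]. The bytes values below are placeholders for Array instances; the real class may additionally validate key and value types.

```python
from collections import OrderedDict

# Stand-in for an ArrayRecord; bytes values take the place of Array objects.
record = OrderedDict(weight=b"w-bytes", bias=b"b-bytes")

missing = record.get("scale")                     # key absent, no default given
fallback = record.get("scale", b"")               # key absent, default returned
created = record.setdefault("scale", b"s-bytes")  # inserts the default and returns it

removed = record.pop("bias")        # removes the entry and returns its value
default = record.pop("bias", None)  # key already gone, default returned

try:
    record.pop("bias")              # key gone and no default: KeyError
    raised = False
except KeyError:
    raised = True
```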
- to_numpy_ndarrays(*, keep_input: bool = True) → list[ndarray[Any, dtype[Any]]]
Return the ArrayRecord as a list of NumPy ndarray.
- to_torch_state_dict(*, keep_input: bool = True) → OrderedDict[str, torch.Tensor]
Return the ArrayRecord as a PyTorch state_dict.
- update([E, ]**F) → None. Update D from mapping/iterable E and F.
If E is present and has a .keys() method, does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v
In either case, this is followed by: for k, v in F.items(): D[k] = v
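The three cases of update can be exercised on an OrderedDict stand-in, which follows the same mapping protocol. As before, bytes values are placeholders for Array instances, and this sketch makes no claim about extra checks the real class may perform.

```python
from collections import OrderedDict

record = OrderedDict(a=b"1")

# E has a .keys() method: entries are copied key by key.
record.update({"b": b"2"})

# E lacks .keys(): it is treated as an iterable of (key, value) pairs.
record.update([("c", b"3")])

# E is applied first, then the keyword arguments F.
# The existing key "a" is overwritten in place; "d" is appended.
record.update({"a": b"9"}, d=b"4")

final_items = list(record.items())
```

Existing keys keep their original position when overwritten, so the resulting order here is a, b, c, d.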
- values() → an object providing a view on D's values.