An optional dict mapping a name to a (de)serializer. See `dask_serialize` and `dask_deserialize` for more.
serialize
This also remove large bytestrings from the message into a second dictionary.
>>> from distributed.protocol import to_serialize >>> msg = {'op': 'update', 'data': to_serialize(123)} >>> extract_serialize(msg) ({'op': 'update'}, {('data',): <Serialize: 123>}, set())
>>> msg = {'op': 'update', 'data': to_serialize(123)} >>> nested_deserialize(msg) {'op': 'update', 'data': 123}
Normally when registering new classes for Dask's custom serialization you need to manage headers and frames, which can be tedious. If all you want to do is traverse through your object and apply serialize to all of your object's attributes then this function may provide an easier path. This registers a class for the custom Dask serialization family. It serializes it by traversing through its __dict__ of attributes and applying ``serialize`` and ``deserialize`` recursively. It collects a set of frames and keeps small attributes in the header. Deserialization reverses this process. This is a good idea if the following hold: 1. Most of the bytes of your object are composed of data types that Dask's custom serializtion already handles well, like Numpy arrays. 2. Your object doesn't require any special constructor logic, other than object.__new__(cls)
>>> import sklearn.base >>> from distributed.protocol import register_generic >>> register_generic(sklearn.base.BaseEstimator)
dask_serialize dask_deserialize
>>> class Human: ... def __init__(self, name): ... self.name = name >>> def serialize(human): ... header = {} ... frames = [human.name.encode()] ... return header, frames >>> def deserialize(header, frames): ... return Human(frames[0].decode()) >>> register_serialization(Human, serialize, deserialize) >>> serialize(Human('Alice')) ({}, [b'Alice'])
serialize deserialize
This takes in an arbitrary Python object and returns a msgpack serializable header and a list of bytes or memoryview objects. The serialization protocols to use are configurable: a list of names define the set of serializers to use, in order. These names are keys in the ``serializer_registry`` dict (e.g., 'pickle', 'msgpack'), which maps to the de/serialize functions. The name 'dask' is special, and will use the per-class serialization methods. ``None`` gives the default list ``['dask', 'pickle']``.
>>> serialize(1) ({}, [b'\x80\x04\x95\x03\x00\x00\x00\x00\x00\x00\x00K\x01.']) >>> serialize(b'123') # some special types get custom treatment ({'type': 'builtins.bytes'}, [b'123']) >>> deserialize(*serialize(1)) 1
deserialize: Convert header and frames back to object to_serialize: Mark that data in a message should be serialized register_serialization: Register custom serialization functions