Class Worker

distributed.worker.Worker

Worker node in a Dask distributed cluster

Declaration

Documentation

Workers perform two functions:

1.  **Serve data** from a local dictionary
2.  **Perform computation** on that data and on data from peers

Workers keep the scheduler informed of their data and use that scheduler to
gather data from other workers when necessary to perform a computation.

You can start a worker with the ``dask-worker`` command line application::

    $ dask-worker scheduler-ip:port

Use the ``--help`` flag to see more options::

    $ dask-worker --help

The rest of this docstring is about the internal state that the worker uses
to manage and track internal computations.

**State**

**Informational State**

These attributes don't change significantly during execution.

* **nthreads:** ``int``:
    Number of threads used by this worker process
* **executor:** ``concurrent.futures.ThreadPoolExecutor``:
    Executor used to perform computation
* **local_directory:** ``path``:
    Path on local machine to store temporary files
* **scheduler:** ``rpc``:
    Location of scheduler.  See ``.ip/.port`` attributes.
* **name:** ``string``:
    Alias
* **services:** ``{str: Server}``:
    Auxiliary web servers running on this worker
* **service_ports:** ``{str: port}``:
* **total_out_connections**: ``int``
    The maximum number of concurrent outgoing requests for data
* **total_in_connections**: ``int``
    The maximum number of concurrent incoming requests for data
* **total_comm_nbytes**: ``int``
* **batched_stream**: ``BatchedSend``
    A batched stream along which we communicate to the scheduler
* **log**: ``[(message)]``
    A structured and queryable log.  See ``Worker.story``
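
As an illustration of what a "structured and queryable log" can look like, the sketch below stores events as tuples and filters them by task key. The ``story`` helper and the log entries here are hypothetical and written for this example; they are not taken from the ``Worker`` implementation.

```python
from collections import deque

# Hypothetical structured log: each entry is a tuple describing one event.
log = deque(maxlen=100_000)
log.append(("x-1", "waiting", "ready"))
log.append(("y-2", "waiting", "flight"))
log.append(("x-1", "ready", "executing"))


def story(log, *keys):
    """Return every log entry that mentions at least one of ``keys``."""
    return [entry for entry in log if any(key in entry for key in keys)]


x_story = story(log, "x-1")
print(x_story)
```

Keeping entries as plain tuples makes the log cheap to append to and easy to filter after the fact.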

**Volatile State**

These attributes track the progress of tasks that this worker is trying to
complete.  In the descriptions below a ``key`` is the name of a task that
we want to compute and ``dep`` is the name of a piece of dependent data
that we want to collect from others.

* **tasks**: ``{key: TaskState}``
    The tasks currently executing on this worker (and any dependencies of those tasks)
* **data:** ``{key: object}``:
    Prefer using the **host** attribute instead of this, unless
    memory_limit and at least one of memory_target_fraction or
    memory_spill_fraction are defined.  In that case this attribute
    is a zict.Buffer, from which information on the LRU cache can be queried.
* **data.memory:** ``{key: object}``:
    Dictionary mapping keys to actual values stored in memory. Only
    available if the condition for **data** being a zict.Buffer is met.
* **data.disk:** ``{key: object}``:
    Dictionary mapping keys to actual values stored on disk. Only
    available if the condition for **data** being a zict.Buffer is met.
* **data_needed**: deque(keys)
    The keys whose data we still lack, arranged in a deque
* **ready**: [keys]
    Keys that are ready to run.  Stored in a LIFO stack
* **constrained**: [keys]
    Keys for which we have the data to run, but are waiting on abstract
    resources like GPUs.  Stored in a FIFO deque
* **executing_count**: ``int``
    A count of tasks currently executing on this worker
* **executed_count**: ``int``
    The number of tasks that this worker has run in its lifetime
* **long_running**: {keys}
    A set of keys of tasks that are running and have started their own
    long-running clients.
* **has_what**: ``{worker: {deps}}``
    The data that we care about that we think a worker has
* **pending_data_per_worker**: ``{worker: [dep]}``
    The data on each worker that we still want, prioritized as a deque
* **in_flight_tasks**: ``int``
    A count of the number of tasks that are coming to us in current
    peer-to-peer connections
* **in_flight_workers**: ``{worker: {task}}``
    The workers from which we are currently gathering data and the
    dependencies we expect from those connections
* **comm_bytes**: ``int``
    The total number of bytes in flight
* **nbytes**: ``{key: int}``
    The size of a particular piece of data
* **threads**: ``{key: int}``
    The ID of the thread on which the task ran
* **active_threads**: ``{int: key}``
    The keys currently running on active threads
* **waiting_for_data_count**: ``int``
    A count of how many tasks are currently waiting for data

Attributes

scheduler_ip : str
scheduler_port : int
ip : str

data : MutableMapping, type, None

The object to use for storage, builds a disk-backed LRU dict by default

nthreads : int
loop : tornado.ioloop.IOLoop

local_directory : str

Directory where we place local resources

name : str

memory_limit : int, float, string

Number of bytes of memory that this worker should use.
Set to zero for no limit.  Set to 'auto' to calculate it as
system.MEMORY_LIMIT * min(1, nthreads / total_cores).
Use strings or numbers like '5GB' or 5e9.
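
The 'auto' formula can be worked through with plain arithmetic. In the sketch below, ``auto_memory_limit`` is a hypothetical helper written for this example, and the system memory and core counts are made-up inputs; it only mirrors the documented expression.

```python
def auto_memory_limit(system_memory_limit, nthreads, total_cores):
    """Hypothetical helper mirroring the documented 'auto' formula."""
    return int(system_memory_limit * min(1, nthreads / total_cores))


GIB = 2**30

# A worker using 2 of 8 cores gets a quarter of the machine's memory.
quarter = auto_memory_limit(16 * GIB, nthreads=2, total_cores=8)

# The min(1, ...) term caps the limit at the system total.
capped = auto_memory_limit(16 * GIB, nthreads=16, total_cores=8)
print(quarter // GIB, capped // GIB)
```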

memory_target_fraction : float

Fraction of memory to try to stay beneath

memory_spill_fraction : float

Fraction of memory at which we start spilling to disk

memory_pause_fraction : float

Fraction of memory at which we stop running new tasks
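
One way to read the three fractions together is as increasing thresholds on used memory relative to ``memory_limit``. The sketch below is an assumption-laden illustration: ``memory_actions`` is a hypothetical helper, the fraction values 0.6, 0.7 and 0.8 are example inputs, and the real worker may measure memory differently.

```python
def memory_actions(used_bytes, memory_limit, target=0.6, spill=0.7, pause=0.8):
    """Hypothetical helper: report which documented thresholds are exceeded."""
    if not memory_limit:  # zero means "no limit", so no threshold applies
        return {"above_target": False, "spill_to_disk": False, "pause_new_tasks": False}
    fraction = used_bytes / memory_limit
    return {
        "above_target": fraction >= target,
        "spill_to_disk": fraction >= spill,
        "pause_new_tasks": fraction >= pause,
    }


actions = memory_actions(used_bytes=750, memory_limit=1000)
print(actions)
```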

executor : concurrent.futures.Executor

resources : dict

Resources that this worker has like ``{'GPU': 2}``
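
A resources dictionary such as ``{'GPU': 2}`` can be compared against what a task requires. The sketch below shows one possible check; ``meets_constraints`` and the ``required`` dictionaries are hypothetical and are not the worker's ``meets_resource_constraints`` method.

```python
def meets_constraints(available, required):
    """Hypothetical check: every required resource is available in sufficient amount."""
    return all(available.get(name, 0) >= amount for name, amount in required.items())


available = {"GPU": 2}

fits = meets_constraints(available, {"GPU": 1})
too_many = meets_constraints(available, {"GPU": 3})
unknown = meets_constraints(available, {"TPU": 1})
print(fits, too_many, unknown)
```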

nanny : str

Address on which to contact nanny, if it exists

lifetime : str

Amount of time like "1 hour" after which we gracefully shut down the worker.
This defaults to None, meaning no explicit shutdown time.

lifetime_stagger : str

Amount of time like "5 minutes" by which to stagger the lifetime value.
The actual lifetime is selected uniformly at random from the interval
lifetime +/- lifetime_stagger.
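
The staggering rule amounts to a uniform draw around the configured lifetime. The sketch below works in seconds and uses a hypothetical ``staggered_lifetime`` helper; parsing strings such as "1 hour" is left out, and this is not the worker's own code.

```python
import random


def staggered_lifetime(lifetime_seconds, stagger_seconds, rng=random):
    """Hypothetical helper: draw uniformly from lifetime +/- stagger."""
    return rng.uniform(
        lifetime_seconds - stagger_seconds,
        lifetime_seconds + stagger_seconds,
    )


rng = random.Random(0)  # seeded only so the example is repeatable
value = staggered_lifetime(3600, 300, rng)  # "1 hour" +/- "5 minutes"
print(3300 <= value <= 3900)
```

Staggering keeps workers that started together from all shutting down at the same moment.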

lifetime_restart : bool

Whether or not to restart a worker after it has reached its lifetime.
Default False.

Examples

Use the command line to start a worker::

    $ dask-scheduler
    Start scheduler at 127.0.0.1:8786

    $ dask-worker 127.0.0.1:8786
    Start worker at:               127.0.0.1:1234
    Registered with scheduler at:  127.0.0.1:8786

Methods

▶ def __init__(self, scheduler_ip=None, scheduler_port=None, scheduler_file=None, ...) override
def __init__(
self,
scheduler_ip=None,
scheduler_port=None,
scheduler_file=None,
ncores=None,
nthreads=None,
loop=None,
local_dir=None,
local_directory=None,
services=None,
service_ports=None,
service_kwargs=None,
name=None,
reconnect=True,
memory_limit="auto",
executor=None,
resources=None,
silence_logs=None,
death_timeout=None,
preload=None,
preload_argv=None,
security=None,
contact_address=None,
memory_monitor_interval="200ms",
extensions=None,
metrics=DEFAULT_METRICS,
startup_information=DEFAULT_STARTUP_INFORMATION,
data=None,
interface=None,
host=None,
port=None,
protocol=None,
dashboard_address=None,
dashboard=False,
http_prefix="/",
nanny=None,
plugins=(),
low_level_profiler=dask.config.get("distributed.worker.profile.low-level"),
validate=None,
profile_cycle_interval=None,
lifetime=None,
lifetime_stagger=None,
lifetime_restart=None,
**kwargs,
)
This method overrides distributed.core.Server.__init__.
▷ def __repr__(self)
▷ def actor_attribute(self, comm=None, actor=None, attribute=None)
▶ def add_task(self, key, function=None, args=None, kwargs=None, task=no_value, ...)
def add_task(
self,
key,
function=None,
args=None,
kwargs=None,
task=no_value,
who_has=None,
nbytes=None,
priority=None,
duration=None,
resource_restrictions=None,
actor=False,
**kwargs2,
)
▷ def bad_dep(self, dep)
▷ def client(self) @property
@property
def client(self)
▷ def cycle_profile(self)
▷ def delete_data(self, comm=None, keys=None, report=True)
▷ def ensure_communicating(self)
▷ def ensure_computing(self)

▶ def executor_submit(self, key, function, args=(), kwargs=None, executor=None) @gen.coroutine
Safely run function in thread pool executor
@gen.coroutine
def executor_submit(
self,
key,
function,
args=(),
kwargs=None,
executor=None,
)

We've run into issues running concurrent.futures futures within
tornado.  Apparently it's advantageous to use timeouts and periodic
callbacks to ensure things run smoothly.  This can get tricky, so we
pull it off into a separate method.
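
The general pattern of handing a function to a thread pool without blocking the event loop can be sketched with the standard library. Note the substitution: this example uses ``asyncio`` and ``loop.run_in_executor`` rather than tornado, it omits the timeouts and periodic callbacks mentioned above, and ``submit_in_thread`` is a hypothetical name.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def submit_in_thread(executor, function, args=(), kwargs=None):
    """Run ``function`` on ``executor`` without blocking the event loop."""
    kwargs = kwargs or {}
    loop = asyncio.get_running_loop()
    # run_in_executor accepts positional arguments only, so bind kwargs first.
    return await loop.run_in_executor(executor, lambda: function(*args, **kwargs))


async def main():
    with ThreadPoolExecutor(max_workers=1) as executor:
        return await submit_in_thread(executor, pow, args=(2, 10))


result = asyncio.run(main())
print(result)
```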

▷ def get_call_stack(self, comm=None, keys=None)

▶ def get_current_task(self)

Get the key of the task we are currently running

This only makes sense to run within a task

Examples

>>> from dask.distributed import get_worker
>>> def f():
...     return get_worker().get_current_task()

>>> future = client.submit(f)  # doctest: +SKIP
>>> future.result()  # doctest: +SKIP
'f-1234'

▶ def identity(self, comm=None) override
This method overrides distributed.core.Server.identity.
▷ def keys(self, comm=None)
▷ def local_dir(self) @property
For API compatibility with Nanny
@property
def local_dir(self)
▷ def logs(self) @property
@property
def logs(self)
▷ def maybe_transition_long_running(self, ts, compute_duration=None)
▷ def meets_resource_constraints(self, key)
▷ def put_key_in_memory(self, ts, value, transition=True)
▷ def release_key(self, key, cause=None, reason=None, report=True)
▷ def rescind_key(self, key)
▷ def run(self, comm, function, args=(), wait=True, kwargs=None)
▷ def run_coroutine(self, comm, function, args=(), kwargs=None, wait=True)
▷ def select_keys_for_gather(self, worker, dep)
▷ def send_task_state_to_scheduler(self, ts)
▷ def send_to_worker(self, address, msg)
▶ def start_ipython(self, comm)
Start an IPython kernel
Returns Jupyter connection info dictionary.
▷ def stateof(self, key)
▷ def steal_request(self, key)
▷ def story(self, *keys)
▷ def transition(self, ts, finish, **kwargs)
▷ def transition_constrained_executing(self, ts)
▷ def transition_executing_done(self, ts, value=no_value, report=True)
▷ def transition_executing_long_running(self, ts, compute_duration=None)
▷ def transition_flight_memory(self, ts, value=None)
▷ def transition_flight_waiting(self, ts, worker=None, remove=True)
▷ def transition_ready_executing(self, ts)
▷ def transition_ready_memory(self, ts, value=None)
▷ def transition_waiting_done(self, ts, value=None)
▷ def transition_waiting_flight(self, ts, worker=None)
▷ def transition_waiting_ready(self, ts)
▶ def trigger_profile(self)
Get a frame from all actively computing threads
Merge these frames into existing profile counts
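
The idea of sampling a frame per thread and folding it into running counts can be sketched with the standard library. This is a simplified illustration under stated assumptions: ``sample_frames`` is a hypothetical helper, it counts only the topmost function per thread, and it is not how the worker's profiler is implemented.

```python
import sys
import threading
from collections import Counter

profile_counts = Counter()


def sample_frames(counts, thread_ids=None):
    """Record the topmost frame of each selected thread in ``counts``."""
    frames = sys._current_frames()  # {thread id: topmost frame}
    for ident, frame in frames.items():
        if thread_ids is not None and ident not in thread_ids:
            continue
        code = frame.f_code
        counts[(code.co_filename, code.co_name)] += 1
    return counts


# Sample the current thread twice; both samples land on the same function.
me = {threading.get_ident()}
sample_frames(profile_counts, me)
sample_frames(profile_counts, me)
print(sum(profile_counts.values()))
```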
▷ def update_data(self, comm=None, data=None, report=True, serializers=None)
▷ def update_who_has(self, who_has)
▷ def validate_state(self)
▷ def validate_task(self, ts)
▷ def validate_task_executing(self, ts)
▷ def validate_task_flight(self, ts)
▷ def validate_task_memory(self, ts)
▷ def validate_task_ready(self, ts)
▷ def validate_task_waiting(self, ts)
▷ def worker_address(self) @property
For API compatibility with Nanny
@property
def worker_address(self)

Inherited methods

Methods inherited from distributed.node.ServerNode:
get_logs, service_ports, start_http_server, start_services, stop_services, versions
Methods inherited from distributed.core.Server:
__await__, address, close, listen_address, listener, log_event, port, start_periodic_callbacks, status, stop

Subclasses

Reexports

Imported in distributed.utils_test.
Imported in distributed.nanny.
Imported in distributed.comm.tests.test_ucx.
Imported in distributed.diagnostics.tests.test_progressbar.
Imported in distributed.diagnostics.tests.test_worker_plugin.
Imported in distributed.diagnostics.tests.test_scheduler_plugin.
Imported in distributed.deploy.local.
Imported in distributed.deploy.ssh as _Worker.
Imported in distributed.deploy.tests.test_local.
Imported in distributed.deploy.tests.test_adaptive.
Imported in distributed.tests.test_utils_test.
Imported in distributed.tests.test_versions.
Imported in distributed.tests.test_tls_functional.
Imported in distributed.tests.test_preload.
Imported in distributed.tests.test_nanny.
Imported in distributed.tests.test_scheduler.
Imported in distributed.tests.test_client.
Imported in distributed.tests.test_resources.
Imported in distributed.tests.test_worker.
Imported in distributed.tests.test_steal.
Imported in distributed.tests.test_priorities.
Imported in distributed.
