4. Asynchronous and Parallel dispatching

When a model involves heavy calculations that take a significant amount of time, you may want to run it asynchronously or in parallel. This is generally difficult to achieve, because it requires a higher level of abstraction and a deeper knowledge of Python programming and the Global Interpreter Lock (GIL). Schedula simplifies this task as well. It has four default executors to dispatch asynchronously or in parallel:

  • async: execute all functions asynchronously in the same process,
  • parallel: execute all functions in parallel excluding SubDispatch functions,
  • parallel-pool: execute all functions in parallel using a process pool excluding SubDispatch functions,
  • parallel-dispatch: execute all functions in parallel including SubDispatch.

Note

Running functions asynchronously or in parallel has a cost: Schedula spends time creating and deleting threads or processes.
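
To get a feel for this kind of overhead, the following minimal sketch uses only the Python standard library, not Schedula. It runs a trivial function directly and then once per thread. The names `tiny`, `run_direct`, and `run_in_threads` are hypothetical and exist only for this illustration, and the measured times depend on the machine.

```python
import threading
import time


def tiny(x):
    # Hypothetical, almost free unit of work.
    return x + 1


def run_direct(values):
    # Call the function in the current thread, one value after another.
    return [tiny(v) for v in values]


def run_in_threads(values):
    # Start one thread per value; each thread stores its own result.
    results = [None] * len(values)

    def worker(i, v):
        results[i] = tiny(v)

    threads = [
        threading.Thread(target=worker, args=(i, v))
        for i, v in enumerate(values)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


values = list(range(200))

t0 = time.perf_counter()
direct = run_direct(values)
direct_time = time.perf_counter() - t0

t0 = time.perf_counter()
threaded = run_in_threads(values)
threaded_time = time.perf_counter() - t0

print(f"direct:   {direct_time:.6f} s")
print(f"threaded: {threaded_time:.6f} s")
```

For work this small, thread start-up usually dominates the run time, so concurrent execution tends to pay off only when each function takes noticeably longer than the scheduling overhead.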

The code below shows an example of time-consuming code: executed sequentially, it requires at least 6 seconds to run (six functions, each sleeping for 1 second). Note that the slow function returns the process id.

>>> import schedula as sh
>>> dsp = sh.Dispatcher()
>>> def slow():
...     import os, time
...     time.sleep(1)
...     return os.getpid()
>>> for o in 'abcdef':
...     dsp.add_function(function=slow, outputs=[o])
'...'

When using the async executor, the run lasts a little more than 1 second:

>>> import time
>>> start = time.time()
>>> sol = dsp(executor='async').result()  # Asynchronous execution.
>>> (time.time() - start) < 2  # Faster than sequential execution.
True

All functions have been executed asynchronously, but in the same process:

>>> import os
>>> pid = os.getpid()  # Current process id.
>>> {sol[k] for k in 'abcdef'} == {pid}  # Single process id.
True
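
The single process id can be reproduced with plain threads. The sketch below is an analogy built on the standard library's `concurrent.futures.ThreadPoolExecutor`; it does not show how Schedula's async executor is implemented. The names `slow_task` and `run_threaded` are hypothetical, and the delay is shortened to 0.2 seconds to keep the example quick.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def slow_task(delay=0.2):
    # Sleeping releases the GIL, so other threads can run meanwhile.
    time.sleep(delay)
    return os.getpid()


def run_threaded(n_tasks=6, delay=0.2):
    # Submit all tasks at once and collect the process ids they report.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        futures = [pool.submit(slow_task, delay) for _ in range(n_tasks)]
        pids = {f.result() for f in futures}
    return pids, time.perf_counter() - start


thread_pids, thread_elapsed = run_threaded()
print(thread_pids == {os.getpid()})  # Threads share the caller's process.
print(f"elapsed: {thread_elapsed:.2f} s")
```

Because the six waits overlap, the total time stays close to one delay rather than six, while every task reports the id of the calling process.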

If we use the parallel executor, all functions are executed in different processes:

>>> sol = dsp(executor='parallel').result()  # Parallel execution.
>>> pids = {sol[k] for k in 'abcdef'}  # Process ids returned by ``slow``.
>>> len(pids) == 6  # Each function returns a different process id.
True
>>> pid not in pids  # The current process id is not in the returned pids.
True
>>> sorted(sh.shutdown_executors())
['async', 'parallel']
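
The distinct process ids seen with the parallel executor follow from each function running in its own operating-system process. The sketch below illustrates only that property with the standard library's `subprocess` module; it is not how Schedula starts its workers, and the name `spawn_pid_reporters` is hypothetical.

```python
import os
import subprocess
import sys


def spawn_pid_reporters(n=3):
    # Launch n child interpreters; each one prints its own process id.
    cmd = [sys.executable, "-c", "import os; print(os.getpid())"]
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for _ in range(n)
    ]
    # Read each child's output and wait for it to exit.
    return [int(p.communicate()[0]) for p in procs]


child_pids = spawn_pid_reporters()
parent_pid = os.getpid()

print(len(set(child_pids)) == len(child_pids))  # Every child has its own id.
print(parent_pid not in child_pids)  # None of them is the parent process.
```

Starting a process is considerably more expensive than starting a thread, which is one reason to release executors once the work is done.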