Hello, I would like to announce dispy (http://dispy.sourceforge.net), a Python framework for distributing computations for parallel execution, from processors/cores on a single node to many nodes over a network. The computations can be Python functions or programs. Any dependencies, such as other Python functions, modules, classes, objects or files, are distributed as well. The result of each computation, along with its output, error messages and exception trace, if any, is made available to the client program for further processing. Popular map/reduce style programs can be easily developed and deployed with dispy.
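As a taste of the map/reduce style mentioned above, here is a minimal single-machine sketch of that pattern using only the standard library's multiprocessing module. This is not dispy's API, just the shape of computation dispy distributes across nodes; the word_count/reduce_counts names are illustrative:

```python
# Illustrative map/reduce sketch using only the stdlib on one machine;
# dispy generalizes this shape so the "map" step runs on remote nodes.
from multiprocessing import Pool

def word_count(text):
    # "map" step: count words in one chunk of text
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def reduce_counts(partials):
    # "reduce" step: merge the per-chunk dictionaries
    total = {}
    for part in partials:
        for word, n in part.items():
            total[word] = total.get(word, 0) + n
    return total

if __name__ == '__main__':
    chunks = ['a b a', 'b c', 'a c c']
    pool = Pool(2)
    try:
        partials = pool.map(word_count, chunks)
    finally:
        pool.close()
        pool.join()
    print(reduce_counts(partials))
```

With dispy, a function like word_count would instead be given to a JobCluster and each chunk submitted as a job, so the "map" step runs on whichever nodes are free.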
There is also an implementation of dispy, called discopy, that uses asynchronous I/O and coroutines, so that discopy scales efficiently to large numbers of network connections (right now this is a bit academic, until it has been tested with such setups). The framework providing asynchronous I/O and coroutines, called asyncoro, is independent of dispy; discopy is an implementation of dispy using asyncoro. Others may find asyncoro itself useful.

Salient features of dispy/discopy:

* Computations (Python functions or standalone programs) and their dependencies (files, Python functions, classes, modules) are distributed automatically.
* Computation nodes can be anywhere on the network (local or remote). For security, either simple hash-based authentication or SSL encryption can be used.
* A computation may specify which nodes are allowed to execute it (for now, using simple patterns of IP addresses).
* After each job finishes, its result, output, errors and exception trace are made available for further processing.
* If a callback function is provided, dispy executes it when a job finishes; this is useful for further processing of job results.
* Nodes may become available dynamically: dispy schedules jobs whenever a node is available and the computation can use that node.
* Client-side and server-side fault recovery are supported. If the user program (client) terminates unexpectedly (e.g., due to an uncaught exception), the nodes continue to execute scheduled jobs; if the fault recovery option was used when creating the cluster, the results of jobs that were scheduled but unfinished at the time of the crash can easily be retrieved later. If a computation is marked re-entrant (with the 'resubmit=True' option) when a cluster is created and a node (server) executing jobs for that computation fails, dispy automatically resubmits those jobs to other available nodes.
* In optimization problems it is useful for computations to send (successive) provisional results back to the client, so it can, for example, terminate the computation early. If computations are Python functions, they can use the 'dispy_provisional_result' function for this purpose.
* dispy can be used in a single process that uses all the nodes exclusively (with JobCluster - simpler to use) or in multiple processes simultaneously sharing the nodes (with SharedJobCluster and dispyscheduler).

dispy works with Python 2.7. It has been tested on Linux and Mac OS X and is known to work on Windows. discopy has been tested on Linux and Mac OS X.

I am not subscribed to the list, so please Cc me if you have comments.

Cheers,
Giri