On 11:19 am, ita...@itamarst.org wrote:
On 10/23/2013 12:50 PM, Phil Mayers wrote:
This is a multiprocessing bug IMHO.
This issue with multiprocessing appears in other places too. E.g. if
you're using stdlib logging, child processes will try to rotate the
parent process logs.
Basically multiprocessing on Unix is utterly broken and should never be
used (except in the fork+exec form in Python 3.4).
To expand on that just a bit, the form of sharing that you get when you
fork() but you don't exec() is very difficult to use correctly (I think
it's an open question whether it's *possible* to use correctly in a
Python program).
The argument here is similar to the argument against shared-everything
multithreading. While memory (and some other per-process state) is no
longer shared after fork(), *some* per-process state is still shared.
And all of the state that isn't shared is still a potential source of
bugs since it's almost certainly the case that none of it cooperated
with the fork() call - a call which happened at some arbitrary time and
captured a snapshot of all the state in memory at an arbitrary point.
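
To make the "some state is shared, the rest is a copied snapshot" point concrete, here is a minimal Unix-only sketch. The function name demo_buffer and the temporary file are just illustration, not anything from Twisted or multiprocessing. A userspace write buffer is copied by fork(), while the file offset underneath it stays shared, so the same bytes land in the file twice:

```python
import os
import tempfile


def demo_buffer():
    """Show a buffered write being flushed by both parent and child."""
    path = os.path.join(tempfile.mkdtemp(), "out.txt")
    f = open(path, "w")
    f.write("hello\n")  # sits in the userspace buffer, not yet on disk

    pid = os.fork()  # the buffer is copied; the file offset stays shared
    if pid == 0:
        # Child: flushing its copy of the buffer writes the data once.
        try:
            f.close()
        finally:
            os._exit(0)  # leave without running any other cleanup

    os.waitpid(pid, 0)
    f.close()  # Parent: flushes its own copy, writing the data again.

    with open(path) as result:
        return result.read()


if __name__ == "__main__":
    print(repr(demo_buffer()))
```

Neither process did anything wrong by its own lights. The buffer simply was not written with fork() in mind.
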
Consider a simple implementation of a lock file, used to prevent
multiple instances of a program from starting. There are several ways
fork() could break such code. Perhaps it is partway through acquiring a
lock on the lock file when the fork() occurs. Perhaps the result is
that the file ends up locked but no process thinks it is holding the
lock. Now no instance of the program is running, and none can start. Or perhaps the lock
is held when fork() happens and the problem only surfaces at unlock
time. Perhaps one of the processes exits and releases the lock. Now
the program is still running but the lock isn't held.
And that's just one of the simplest possible examples of how things can
go wrong.
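
A rough sketch of the second lock file failure described above, assuming a deliberately naive lock where the file's existence means "held". LockFile, run_worker_in_child and demo_lock are hypothetical names made up for this example, and it is Unix-only since it calls os.fork():

```python
import os
import tempfile


class LockFile:
    """Hypothetical lock file: the lock is held while the file exists."""

    def __init__(self, path):
        self.path = path

    def acquire(self):
        # O_EXCL makes creation fail if another instance holds the lock.
        fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)

    def release(self):
        os.unlink(self.path)


def run_worker_in_child(lock):
    pid = os.fork()
    if pid == 0:
        # Child: it inherited a copy of `lock` and the belief that it
        # owns the lock, so its shutdown path "helpfully" releases it.
        try:
            lock.release()
        finally:
            os._exit(0)
    os.waitpid(pid, 0)


def demo_lock():
    path = os.path.join(tempfile.mkdtemp(), "app.lock")
    lock = LockFile(path)
    lock.acquire()
    held_before = os.path.exists(path)
    run_worker_in_child(lock)
    held_after = os.path.exists(path)  # parent is still running here
    return held_before, held_after


if __name__ == "__main__":
    print(demo_lock())
```

After the child exits, the parent is still running but a second instance could now acquire the lock.
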
The nearly uncountable ways for failures to creep in, and the resulting
impracticality (if not impossibility) of testing that Twisted (or any
Python library) actually works when fork() is used, mean that Twisted is
unlikely ever to be declared compatible with any fork()-without-exec()
usage.
You can find some examples of Twisted-using applications that run
multiple processes, though. Apple CalendarServer does it by passing
file descriptors to worker processes and sending them the location of a
configuration file describing how they should behave. Divmod Mantissa
does it by inserting self-describing work into a SQLite3 database. When
the worker process finds one of these, it knows what code to load and
run by looking at the fields in the row. These are variations on a
theme - RPC, not shared (or duplicated) memory.
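
As a sketch of that theme (not how CalendarServer or Mantissa actually do it), the parent below starts a fresh interpreter, which is fork followed by exec on Unix, and sends it self-describing jobs as JSON lines. The job format, the TASKS table and run_jobs are all assumptions invented for this example:

```python
import json
import subprocess
import sys

# Source for the worker, run in a brand-new interpreter. It shares no
# memory with the parent and learns what to do only from the messages.
WORKER_SOURCE = r"""
import json, sys
TASKS = {"add": lambda a, b: a + b, "upper": lambda s: s.upper()}
for line in sys.stdin:
    job = json.loads(line)
    result = TASKS[job["task"]](*job["args"])
    print(json.dumps({"id": job["id"], "result": result}), flush=True)
"""


def run_jobs(jobs):
    """Send each job to a worker process and collect its replies."""
    proc = subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    payload = "".join(json.dumps(job) + "\n" for job in jobs)
    out, _ = proc.communicate(payload)  # closes stdin, so the worker exits
    return [json.loads(line) for line in out.splitlines()]


if __name__ == "__main__":
    print(run_jobs([{"id": 1, "task": "add", "args": [2, 3]}]))
```

Everything the worker knows arrives in a message, so there is no inherited snapshot of the parent's state to go stale.
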
Hope this helps,
Jean-Paul
_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python