On Aug 5, 1:21 am, sturlamolden <sturlamol...@yahoo.no> wrote:
> On Aug 5, 4:37 am, erikcw <erikwickst...@gmail.com> wrote:
> > It's not always the same traceback, but they are always short like
> > this. I'm running Python 2.6.2 on Ubuntu 9.04.
> >
> > Any idea how I can debug this?
>
> In my experience, multiprocessing is fragile. Scripts tend to fail for
> no obvious reason, cause processes to be orphaned and linger, leak
> system-wide resources, etc. For example, multiprocessing uses os._exit
> to stop a spawned process, even though it inevitably results in
> resource leaks on Linux (it should use sys.exit). Gaël Varoquaux and I
> noticed this when we implemented shared memory ndarrays for numpy; we
> consistently got memory leaks with System V IPC for no obvious reason.
> Even after Jesse Noller was informed of the problem (about half a year
> ago), the bug still lingers. It is easy to edit multiprocessing's
> forking.py file on your own, but bugs like this are a pain in the ass,
> and I suspect multiprocessing has many of them. Of course, unless you
> show us your whole script, identifying the source of your bug will be
> impossible. But it may very likely be in multiprocessing as well. The
> quality of this module is not impressive. I am beginning to think
> that multiprocessing should never have made it into the Python
> standard library. The GIL cannot be that bad! If you can't stand the
> GIL, get a Unix (or Mac, Linux, Cygwin) and use os.fork. Or simply
> switch to a non-GIL Python: IronPython or Jython.
>
> Allow me to show you something better. With os.fork we can write code
> like this:
>
>     class parallel(object):
>
>         def __enter__(self):
>             # call os.fork
>
>         def __exit__(self, exc_type, exc_value, traceback):
>             # call sys.exit in the child processes and
>             # os.waitpid in the parent
>
>         def __call__(self, iterable):
>             # return different subsequences depending on
>             # child or parent status
>
>     with parallel() as p:
>         # parallel block starts here
>
>         for item in p(iterable):
>             # whatever
>
>         # parallel block ends here
>
> This makes parallel code a lot cleaner than anything you can do with
> multiprocessing, allowing you to use constructs similar to OpenMP.
> Further, if you make 'parallel' a dummy context manager, you can
> develop and test the algorithms serially. The only drawback is that
> you have to use Cygwin to get os.fork on Windows, and forking will be
> less efficient there (no copy-on-write optimization). Well, this is
> just one example of why Windows sucks from the perspective of the
> programmer. But it also shows that you can do much better by not
> using multiprocessing at all.
>
> The only case I can think of where multiprocessing would be useful
> is I/O-bound code on Windows. But there you will almost always resort
> to C extension modules. For I/O-bound code, Python tends to give you
> a 200x speed penalty over C. If you are resorting to C anyway, you
> can just use OpenMP in C for your parallel processing. We can thus
> forget about multiprocessing here as well, given that we have access
> to the C code. If we don't, it is still very likely that the C code
> releases the GIL, and we can get away with using Python threads
> instead of multiprocessing.
>
> IMHO, if you are using multiprocessing, you are very likely to have a
> design problem.
>
> Regards,
> Sturla
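A quick aside on the os._exit versus sys.exit distinction raised above.
The snippet below is only a minimal demonstration of the difference,
not multiprocessing's actual code: os._exit terminates the process
immediately, skipping atexit handlers, 'finally' clauses and other
interpreter cleanup, while sys.exit raises SystemExit so that cleanup
runs normally.

    import atexit
    import os
    import sys

    def cleanup():
        # Stands in for any cleanup a process should do on exit,
        # e.g. releasing System V IPC shared memory segments.
        sys.stdout.write("cleanup ran in pid %d\n" % os.getpid())

    atexit.register(cleanup)

    pid = os.fork()
    if pid == 0:
        os._exit(0)     # child: dies immediately, cleanup() never runs
    os.waitpid(pid, 0)  # parent: reap the child
    sys.exit(0)         # parent: SystemExit is raised, cleanup() runs

Run on Linux, this prints a single cleanup line for the parent; the
child's registered handler is silently skipped, which is how resources
end up leaking when a process is stopped with os._exit.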
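The fork-based 'parallel' sketch can likewise be fleshed out into
something runnable. The version below is only one possible reading of
it: the nprocs parameter, the rank bookkeeping and the round-robin
split of the iterable are my guesses at the intent, and it is
POSIX-only since it needs os.fork.

    import os
    import sys

    class parallel(object):

        def __init__(self, nprocs=4):
            self.nprocs = nprocs
            self.rank = 0       # the parent keeps rank 0
            self.pids = []

        def __enter__(self):
            # Fork nprocs - 1 children; each child records its rank
            # and stops forking, the parent collects the child pids.
            for rank in range(1, self.nprocs):
                pid = os.fork()
                if pid == 0:
                    self.rank = rank
                    self.pids = []
                    break
                self.pids.append(pid)
            return self

        def __exit__(self, exc_type, exc_value, traceback):
            if self.rank == 0:
                # Parent: wait for every child to finish.
                for pid in self.pids:
                    os.waitpid(pid, 0)
            else:
                # Child: leave the block by exiting the process
                # (sys.exit rather than os._exit, so cleanup runs).
                sys.exit(0 if exc_type is None else 1)

        def __call__(self, iterable):
            # Round-robin partitioning: each process takes every
            # nprocs-th item, offset by its own rank.
            for i, item in enumerate(iterable):
                if i % self.nprocs == self.rank:
                    yield item

    with parallel(nprocs=4) as p:
        for item in p(range(20)):
            print("pid %d got item %d" % (os.getpid(), item))

As the post notes, swapping in a dummy parallel (whose __enter__ and
__exit__ do nothing and whose __call__ yields the iterable unchanged)
lets the same loop body run serially for development and testing.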
Sturla: that bug was fixed, unless I'm missing something. Also, patches
and continued bug reports are welcome.

jesse
--
http://mail.python.org/mailman/listinfo/python-list