[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-08-13 Thread Greg Brockman
Greg Brockman added the comment: I'll take another stab at this. In the attachment (assign-tasks.patch), I've combined a lot of the ideas presented on this issue, so thank you both for your input. Anyway: - The basic idea of the patch is to record the mapping of tasks to worke

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-08-20 Thread Greg Brockman
Greg Brockman added the comment: Thanks for looking at it! Basically this patch requires the parent process to be able to send a message to a particular worker. As far as I can tell, the existing queues allow the children to send a message to the parent, or the parent to send a message to

[issue8296] multiprocessing.Pool hangs when issuing KeyboardInterrupt

2010-08-26 Thread Greg Brockman
Changes by Greg Brockman : -- nosy: +gdb ___ Python tracker <http://bugs.python.org/issue8296> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.pyth

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-08-27 Thread Greg Brockman
Greg Brockman added the comment: Hmm, a few notes. I have a bunch of nitpicks, but those can wait for a later iteration. (Just one style nit: I noticed a few unneeded whitespace changes... please try not to do that, as it makes the patch harder to read.) - Am I correct that you handle a

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-08-27 Thread Greg Brockman
Greg Brockman added the comment: Ah, you're right--sorry, I had misread your code. I hadn't noticed the usage of the worker_pids. This explains what you're doing with the ACKs. Now, the problem is, I think doing it this way introduces some races (which is why I introduced t

[issue4106] multiprocessing occasionally spits out exception during shutdown

2010-07-08 Thread Greg Brockman
Greg Brockman added the comment: For what it's worth, I think I have a simpler reproducer of this issue. Using freshly-compiled python-from-trunk (as well as multiprocessing-from-trunk), I get tracebacks from the following about 30% of the time: """ import m

[issue4106] multiprocessing occasionally spits out exception during shutdown

2010-07-08 Thread Greg Brockman
Greg Brockman added the comment: I'm on Ubuntu 10.04, 64 bit.

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-08 Thread Greg Brockman
New submission from Greg Brockman : I have recently begun using multiprocessing for a variety of batch jobs. It's a great library, and it's been quite useful. However, I have been bitten several times by situations where a worker process in a Pool will unexpectedly di

[issue9207] multiprocessing occasionally spits out exception during shutdown

2010-07-08 Thread Greg Brockman
New submission from Greg Brockman : On Ubuntu 10.04, using freshly-compiled python-from-trunk (as well as multiprocessing-from-trunk), I get tracebacks from the following about 30% of the time: """ import multiprocessing, time def foo(x):

[issue4106] multiprocessing occasionally spits out exception during shutdown

2010-07-08 Thread Greg Brockman
Greg Brockman added the comment: Sure thing. See http://bugs.python.org/issue9207.

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-08 Thread Greg Brockman
Greg Brockman added the comment: That's likely a mistake on my part. I'm not observing this using the stock version of multiprocessing on my Ubuntu machine (after running O(100) times). I do, however, observe it when using either python2.7 or python2.6 with multiprocessing-from

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-08 Thread Greg Brockman
Greg Brockman added the comment: No, I'm not using the Google code backport. To be clear, I've tried testing this with two versions of multiprocessing: - multiprocessing-from-trunk (r82645): I get these exceptions with ~40% frequency - multiprocessing from Ubuntu 10.04 (version 0

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-08 Thread Greg Brockman
Greg Brockman added the comment: > Wait - so, you are pulling svn trunk, compiling and running your test > with the built python executable? Yes. I initially observed this issue while using 10.04's Python (2.6.5), but wanted to make sure it wasn't fixed by using a newer int

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-08 Thread Greg Brockman
Greg Brockman added the comment: Yeah, I've just taken a checkout from trunk, run './configure && make && make install', and reproduced on: - Ubuntu 10.04 32-bit - Ubuntu 9.04 32-bit

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-08 Thread Greg Brockman
Greg Brockman added the comment: With the line commented out, I no longer see any exceptions. Although, if I understand what's going on, there is still a (much rarer) possibility of an exception, right? I guess in the common case, the worker_handler is in the sleep when shutdown begins.

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-09 Thread Greg Brockman
Greg Brockman added the comment: Think http://www.mail-archive.com/python-l...@python.org/msg282114.html is relevant?

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-10 Thread Greg Brockman
Greg Brockman added the comment: Cool, thanks. I'll note that with this patch applied, using the test program from 9207 I consistently get the following exception: """ Exception in thread Thread-1 (most likely raised during interpreter shutdown): Traceback (most recen

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-10 Thread Greg Brockman
Greg Brockman added the comment: What about just catching the exception? See e.g. the attached patch. (Disclaimer: not heavily tested). -- Added file: http://bugs.python.org/file17934/shutdown.patch

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-12 Thread Greg Brockman
Greg Brockman added the comment: With pool.py:272 commented out, running about 50k iterations, I saw 4 tracebacks giving an exception on pool.py:152. So this seems to imply the race does exist (i.e. that the thread is in _maintain_pool rather than time.sleep when shutdown begins). It looks

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-12 Thread Greg Brockman
Greg Brockman added the comment: Thanks much for taking a look at this! > why are you terminating the second pass after finding a failed > process? Unfortunately, if you've lost a worker, you are no longer guaranteed that cache will eventually be empty. In particular, you may

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-12 Thread Greg Brockman
Greg Brockman added the comment: > For processes disappearing (if that can at all happen), we could solve > that by storing the jobs a process has accepted (started working on), > so if a worker process is lost, we can mark them as failed too. Sure, this would be reasonable behavio

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-13 Thread Greg Brockman
Greg Brockman added the comment: > What kind of errors are you having that makes the get() call fail? Try running the script I posted. It will fail with an AttributeError (raised during unpickling) and hang. I'll note that the particular issues that I've run into in practice ar

[issue9244] multiprocessing.pool: Worker crashes if result can't be encoded

2010-07-13 Thread Greg Brockman
Greg Brockman added the comment: This looks pretty reasonable to my untrained eye. I successfully applied and ran the test suite. To be clear, the errback change and the unpickleable result change are actually orthogonal, right? Anyway, I'm not really familiar with the protocol here

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-13 Thread Greg Brockman
Greg Brockman added the comment: While looking at your patch in issue 9244, I realized that my code fails to handle an unpickleable task, as in:
"""
#!/usr/bin/env python
import multiprocessing
foo = lambda x: x
p = multiprocessing.Pool(1)
p.apply(foo, [1])
"""
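A minimal sketch of why the task in this comment cannot be handed to a worker. It assumes Python 3 and the standard `pickle` module, not the 2.x trunk discussed in the thread. Pool serializes tasks with pickle before sending them, and a lambda has no importable name for pickle to record. The variable `pickling_error` is introduced here purely for illustration.

```python
import pickle

# Same shape as the task in the comment: a lambda has no importable name,
# so pickle has nothing to store a reference to.
foo = lambda x: x

try:
    pickle.dumps(foo)
    pickling_error = None
except (pickle.PicklingError, AttributeError) as exc:
    # The exact exception class differs between Python versions,
    # so both are caught here.
    pickling_error = exc

print(type(pickling_error).__name__)
```

The failure happens on the sending side, before any worker sees the task. That makes it a different case from a result that pickles in the worker but cannot be unpickled by the parent.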

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-14 Thread Greg Brockman
Greg Brockman added the comment: Before I forget, looks like we also need to deal with the result from a worker being un-unpickleable: """ #!/usr/bin/env python import multiprocessing def foo(x): global bar def bar(x): pass return bar p = multiprocessing.Pool(1)

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-15 Thread Greg Brockman
Greg Brockman added the comment: >> Before I forget, looks like we also need to deal with the >> result from a worker being un-unpickleable: >This is what my patch in bug 9244 does... Really? I could be misremembering, but I believe you deal with the case of the result bei

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-15 Thread Greg Brockman
Greg Brockman added the comment: Actually, the program you demonstrate is nonequivalent to the one I posted. The one I posted pickles just fine because 'bar' is a global name, but doesn't unpickle because it doesn't exist in the parent's namespace. (See http:
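The pickle-by-name behaviour described in this comment can be sketched in a single process, without a Pool. This is an illustration under stated assumptions, not the script from the issue. It assumes Python 3, and the module name `worker_ns` is a hypothetical stand-in for the worker's namespace. Removing the name before loading imitates a parent process that never defined `bar`.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the worker's module namespace.
worker_ns = types.ModuleType("worker_ns")
sys.modules["worker_ns"] = worker_ns

def bar(x):
    pass

# Make the function look as if it were a global of worker_ns.
bar.__module__ = "worker_ns"
bar.__qualname__ = "bar"
worker_ns.bar = bar

# Pickling succeeds: only the reference "worker_ns.bar" is stored,
# not the function body.
payload = pickle.dumps(bar)

# Imitate the parent, where the name was never defined.
del worker_ns.bar

try:
    pickle.loads(payload)
    unpickle_error = None
except AttributeError as exc:
    # The lookup of "bar" on the module fails at load time.
    unpickle_error = exc

print(type(unpickle_error).__name__)
```

Because the error only appears when the payload is loaded, the side that produced the pickle sees no failure at all, which matches the AttributeError raised during unpickling that is reported elsewhere in this thread.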

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-15 Thread Greg Brockman
Greg Brockman added the comment: Started looking at your patch. It seems to behave reasonably, although it still doesn't catch all of the failure cases. In particular, as you note, crashed jobs won't be noticed until the pool shuts down... but if you make a blocking call such

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-20 Thread Greg Brockman
Greg Brockman added the comment: At first glance, looks like there are a number of sites where you don't change the blocking calls to non-blocking calls (e.g. get()). Almost all of the get()s have the potential to be called when there is no possibility for them to terminate. I

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-21 Thread Greg Brockman
Greg Brockman added the comment: > I thought the EOF errors would take care of that, at least this has > been running in production on many platforms without that happening. There are a lot of corner cases here, some more pedantic than others. For example, suppose a child dies while h

[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2010-07-22 Thread Greg Brockman
Changes by Greg Brockman : -- nosy: +gdb

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-27 Thread Greg Brockman
Greg Brockman added the comment: > You can't have a sensible default timeout, because the worker may be > processing something important... In my case, the jobs are either functional or idempotent anyway, so aborting halfway through isn't a problem. In general though, I'

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-27 Thread Greg Brockman
Greg Brockman added the comment: Thanks for the comment. It's good to know what constraints we have to deal with. > we can not, however, change the API. Does this include adding optional arguments?

[issue9535] Pending signals are inherited by child processes

2010-08-06 Thread Greg Brockman
New submission from Greg Brockman : Upon os.fork(), pending signals are inherited by the child process. This can be demonstrated by pressing C-c in the middle of the following program: """ import os, sys, time, threading def do_fork(): while True: