New submission from Ryan Leslie <ryle...@gmail.com>:

While developing an application, I noticed an inconsistency where, depending on the particular signal handler in use, multiprocessing.Queue.put() may (or may not) raise OSError after sys.exit() is called by the handler. The following example, tested with Python 2.6.1 on Linux, demonstrates this.
#!/usr/bin/env python
import multiprocessing
import signal
import sys

def handleKill(signum, frame):
    #sys.stdout.write("Exit requested by signal.\n")
    print "Exit requested by signal."
    sys.exit(1)

signal.signal(signal.SIGTERM, handleKill)

queue = multiprocessing.Queue(maxsize=1)
queue.put(None)
queue.put(None)

When the script is run, the process will block (as expected) on the second queue.put(). If (from another terminal) I send the process SIGTERM, I consistently see:

$ ./q.py
Exit requested by signal.
$

Now, if I modify the above program by commenting out the 'print' and uncommenting the 'sys.stdout' line (a very subtle change), I would expect the result to be the same when killing the process. Instead, I consistently see:

$ ./q.py
Exit requested by signal.
Traceback (most recent call last):
  File "./q.py", line 15, in <module>
    queue.put(None)
  File "python2.6/multiprocessing/queues.py", line 75, in put
    if not self._sem.acquire(block, timeout):
OSError: [Errno 0] Error
$

After debugging this further, the issue appears to be in semlock_acquire() of semaphore.c in Modules/_multiprocessing:

http://svn.python.org/view/python/trunk/Modules/_multiprocessing/semaphore.c?revision=71009&view=markup

The relevant code from (the Unix version of) semlock_acquire() is:

    do {
        Py_BEGIN_ALLOW_THREADS
        if (blocking && timeout_obj == Py_None)
            res = sem_wait(self->handle);
        else if (!blocking)
            res = sem_trywait(self->handle);
        else
            res = sem_timedwait(self->handle, &deadline);
        Py_END_ALLOW_THREADS
        if (res == MP_EXCEPTION_HAS_BEEN_SET)
            break;
    } while (res < 0 && errno == EINTR && !PyErr_CheckSignals());

    if (res < 0) {
        if (errno == EAGAIN || errno == ETIMEDOUT)
            Py_RETURN_FALSE;
        else if (errno == EINTR)
            return NULL;
        else
            return PyErr_SetFromErrno(PyExc_OSError);
    }

In both versions of the program (print vs. sys.stdout), sem_wait() is interrupted and returns -1 with errno set to EINTR. This is what I expected. Also, in both cases the loop (correctly) terminates because PyErr_CheckSignals() returns non-zero. This makes sense too: the call executes our signal handler and then returns -1, since our particular handler raises SystemExit.

However, I suspect that, depending on the exact code executed by the signal handler, errno may or may not wind up being reset somewhere inside the call to PyErr_CheckSignals(). I believe the error-checking code below the do-while (where sem_wait() is called) needs errno to still hold the value set by sem_wait(), and the author wasn't expecting anything else to have changed it. In the "print" version, errno apparently remains EINTR, so we reach the `return NULL' statement. In the "sys.stdout" version (and probably many others), errno winds up being reset to 0, and the error handling falls through to the `return PyErr_SetFromErrno(PyExc_OSError)' statement.

To patch this up, we can probably just save errno as, say, `wait_errno' at the end of the loop body, and then use it within the error-handling block that follows. However, the rest of the code should probably be checked for this type of issue.
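In other words, the change might look something like this (an untested sketch against the excerpt above; `wait_errno' is just the placeholder name suggested here, not anything in the actual source):

    int wait_errno = 0;  /* hypothetical new local to remember sem_wait()'s errno */

    do {
        Py_BEGIN_ALLOW_THREADS
        if (blocking && timeout_obj == Py_None)
            res = sem_wait(self->handle);
        else if (!blocking)
            res = sem_trywait(self->handle);
        else
            res = sem_timedwait(self->handle, &deadline);
        Py_END_ALLOW_THREADS
        wait_errno = errno;  /* saved before PyErr_CheckSignals() can run the handler */
        if (res == MP_EXCEPTION_HAS_BEEN_SET)
            break;
    } while (res < 0 && wait_errno == EINTR && !PyErr_CheckSignals());

    if (res < 0) {
        if (wait_errno == EAGAIN || wait_errno == ETIMEDOUT)
            Py_RETURN_FALSE;
        else if (wait_errno == EINTR)
            return NULL;
        else {
            errno = wait_errno;  /* restore so the exception reports the real error */
            return PyErr_SetFromErrno(PyExc_OSError);
        }
    }

With the saved value, the EINTR branch is taken regardless of what the signal handler did to errno, so the spurious "OSError: [Errno 0] Error" should disappear.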
----------
components: Library (Lib)
messages: 89804
nosy: ryles
severity: normal
status: open
title: multiprocessing: handling of errno after signals in sem_acquire()
type: behavior
versions: Python 2.6, Python 2.7, Python 3.0, Python 3.1, Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6362>
_______________________________________