[issue10394] subprocess Popen deadlock
New submission from Christoph Mathys:

The constructor of subprocess.Popen has a race condition, which the attached program should demonstrate (on my computer a few seconds are enough). Program One sleeps for 2 seconds; Program Two exits right after execve. I would expect Program Two to take a very short time between Popen and the completion of wait(), but it regularly takes about 2 seconds.

The problem is this: Popen._execute_child opens a pipe and then sets the FD_CLOEXEC flag on it. If thread_1 has just created the pipe but has not yet set FD_CLOEXEC when thread_2 calls fork(), thread_1 will block when it reads from the pipe (errpipe_read). The process forked by thread_1 will close the pipe, but the process forked by thread_2 will only close its inherited copy when it exits, blocking thread_1 inside the read function until then.

I see different options:

- Linux has the platform-specific flag O_CLOEXEC, which sets close-on-exec during open() (the open man page says since 2.6.23, so this is highly platform dependent).
- To solve the problem just for Popen's constructor, it is enough to serialize all code from before pipe() until after fork(). This can still lead to problems if fork is called in contexts other than Popen's constructor.
- A general solution would be to use a socket, which can be shutdown().

If close_fds is set for Popen's constructor, the problem does not occur, because the extra pipe in the forked process will be closed.

--
components: Library (Lib)
files: deadlock.py
messages: 121036
nosy: Christoph.Mathys
priority: normal
severity: normal
status: open
title: subprocess Popen deadlock
type: behavior
versions: Python 2.6
Added file: http://bugs.python.org/file19579/deadlock.py

___
Python tracker <http://bugs.python.org/issue10394>
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10394] subprocess Popen deadlock
Christoph Mathys added the comment:

Yes, it's the correct file. Sorry, I made quite a mess of my description of the programs: the "attached program" is deadlock.py. Program One and Program Two are Python scripts executed using "python -c"; their code is inside deadlock.py.

I installed Python 2.7 (2.7.0+) and 3.1 (3.1.2, had to fix a print statement) and could reproduce the error on both versions. Checking the code in subprocess.py confirmed that the bug is still there. However, I had to increase the number of threads (deadlock.py, line 38) to provoke the error, but I used different hardware and a different OS release than in the first test (still multi-core on Linux).

What do you expect on failure? I'm a noob when it comes to Python; the script just prints "command took too long: ", nothing else...

--
versions: +Python 2.7, Python 3.1
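For readers without the attachment, a timing check of the kind described in this thread might look like the sketch below. It is not the attached deadlock.py: the function names, the thread counts, the threshold of half the sleep time, and the duration printed after the message prefix are all assumptions made for illustration. On interpreters where pipes are created close-on-exec atomically, it is not expected to report anything.

```python
import subprocess
import sys
import threading
import time

# Stands in for "Program Two": a child that exits immediately.
FAST_CHILD = [sys.executable, "-c", "pass"]


def slow_child(seconds):
    # Stands in for "Program One": a child that sleeps before exiting.
    return [sys.executable, "-c", "import time; time.sleep(%s)" % float(seconds)]


def timed_runs(cmd, rounds, durations):
    # Start cmd `rounds` times, recording how long Popen() plus wait() took.
    for _ in range(rounds):
        start = time.monotonic()
        subprocess.Popen(cmd).wait()
        durations.append(time.monotonic() - start)


def run_trial(pairs=4, rounds=2, sleep_s=2.0):
    """Return the fast-child durations that exceeded half of sleep_s."""
    fast_durations = []
    threads = []
    for _ in range(pairs):
        # One thread keeps forking slow children, one keeps forking fast ones.
        threads.append(threading.Thread(
            target=timed_runs, args=(slow_child(sleep_s), rounds, [])))
        threads.append(threading.Thread(
            target=timed_runs, args=(FAST_CHILD, rounds * 10, fast_durations)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [d for d in fast_durations if d > sleep_s / 2]


if __name__ == "__main__":
    for d in run_trial():
        print("command took too long: %.2f s" % d)
```

A fast child that takes roughly as long as the slow child's sleep is the symptom described in the report; raising `pairs` corresponds to increasing the number of threads.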