[issue39763] distutils.spawn should use subprocess (hang in parallel builds on QNX)

2020-03-02 Thread Elad Lahav
Elad Lahav added the comment: "Attached fork_mt.py example uses exec_fn() function... which is not defined. Is it on purpose?" That was my mistake - a copy/paste of existing code from distutils. Since the example hangs before the script gets to exec_fn() I didn't notice the

[issue39763] distutils.spawn should use subprocess (hang in parallel builds on QNX)

2020-02-29 Thread Elad Lahav
Elad Lahav added the comment: "setup.py doesn't use multiprocessing. multiprocessing is super complex. Would it be possible to write a reproducer which doesn't use multiprocessing?" But the problem is with the handling of fork() by Python modules, and specifically with

[issue39763] distutils.spawn should use subprocess (hang in parallel builds on QNX)

2020-02-27 Thread Elad Lahav
Elad Lahav added the comment: I may be missing something, but subprocess depends on _posixsubprocess, which doesn't get built until setup.py is run, which results in a circular dependency.

[issue39763] distutils.spawn should use subprocess (hang in parallel builds on QNX)

2020-02-27 Thread Elad Lahav
Elad Lahav added the comment: As sure as I can be given my limited experience debugging Python... Luckily I do know my way around the QNX kernel ;-) The stack trace for the child shows it stuck on a semaphore with a count value of 0. A print in the logging module shows that the child gets

[issue39763] distutils.spawn should use subprocess (hang in parallel builds on QNX)

2020-02-27 Thread Elad Lahav
Elad Lahav added the comment: OK, but that's not the problem I see. The parent calls fork(), creates a child that then runs the atfork() handlers *before* returning from the os.fork() call (which is the expected behaviour). At least one of those atfork() handlers is the one register

[issue39763] distutils.spawn should use subprocess (hang in parallel builds on QNX)

2020-02-27 Thread Elad Lahav
Elad Lahav added the comment: "When I uncomment the os.execl() line, the program runs and completes." In that case I'm not sure it is the same issue. The child processes in your case executed their part of the new_process function, which then returned. Nevertheless from the

[issue39763] distutils.spawn should use subprocess (hang in parallel builds on QNX)

2020-02-26 Thread Elad Lahav
Elad Lahav added the comment: "Your script contains a bug (there is no definition of 'exec_fn')" Told you I wasn't much of a Python developer ;-) This was just copy-pasted from spawn.py and I missed the fact that exec_fn() is not a library function. Your last comm

[issue39763] distutils.spawn should use subprocess (hang in parallel builds on QNX)

2020-02-26 Thread Elad Lahav
Elad Lahav added the comment: I'm not convinced that a multi-threaded fork()+exec() from C would be any better, unless the Python code goes to great lengths to avoid any non-async-signal-safe operations between the fork() and the exec(). So along with the proposed change to switch t

[issue39763] Hang after fork due to logging trying to reacquire the module lock in an atfork() handler

2020-02-26 Thread Elad Lahav
New submission from Elad Lahav: The attached code causes the child processes to hang on QNX. The hang is caused by the logging module trying to acquire the module lock while in an atfork() handler. In a system where semaphore state is kept in user mode and is thus inherited from the parent
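
The submission above is cut off by the archive, so what follows is only a hedged illustration of the general hazard it points at, not the attached reproducer. It assumes a typical CPython build on a POSIX system, where a threading.Lock held at the time of os.fork() is copied into the child in its locked state. The function name child_sees_lock_held is hypothetical, and the child probes the lock without blocking so the sketch itself cannot hang.

```python
import os
import threading


def child_sees_lock_held():
    """Fork while holding a lock; return True if the child finds it locked.

    Hypothetical helper for illustration only; not code from the issue.
    """
    lock = threading.Lock()
    lock.acquire()  # the parent holds the lock across fork()

    pid = os.fork()
    if pid == 0:
        # Child: the lock object was copied together with its state.
        # Probe it without blocking so this demo can never hang.
        got_it = lock.acquire(blocking=False)
        # Exit code 1 means "the lock was inherited as locked".
        os._exit(0 if got_it else 1)

    # Parent: release our own copy and collect the child's verdict.
    lock.release()
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 1


if __name__ == "__main__":
    print("child saw lock held:", child_sees_lock_held())
```

If the holder were a thread that does not exist in the child, a blocking acquire on such a lock in the child would never return, which is the general shape of hang that fork handlers have to guard against.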