Hi,

I'm investigating an issue where mpirun *sometimes* hangs after a program
calls MPI_Abort: all of the MPI processes have terminated, but mpirun
itself is still running. This happens with both 1.8.8 and 1.10.2. The hung
mpirun appears to have two threads, one in this path:

#0  0x00007fa09c3143b3 in select () from /lib64/libc.so.6
#1  0x00007fa09b001e2c in listen_thread (obj=0x7fa09b2109e8) at
../../../../../../../../orte/mca/oob/tcp/oob_tcp_listener.c:685
#2  0x00007fa09c5ceaa1 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fa09c31b93d in clone () from /lib64/libc.so.6

and the other in this:

#0  0x00007fa09c312113 in poll () from /lib64/libc.so.6
#1  0x00007fa09d318e7d in poll_dispatch (base=0x1568a80, tv=0x0) at
../../../../../../../../../opal/mca/event/libevent2021/libevent/poll.c:165
#2  0x00007fa09d30d96c in opal_libevent2021_event_base_loop (base=0x1568a80,
flags=1) at
../../../../../../../../../opal/mca/event/libevent2021/libevent/event.c:1633
#3  0x00000000004056fc in orterun (argc=2, argv=0x7ffe70248078) at
../../../../../../../orte/tools/orterun/orterun.c:1142
#4  0x0000000000403614 in main (argc=2, argv=0x7ffe70248078) at
../../../../../../../orte/tools/orterun/main.c:13

But since the hang is in mpirun itself, I'm not sure how to dig deeper. Is
there an MCA *_base_verbose parameter (or equivalent) that takes effect in
mpirun?

Cheers,
Ben

