I am running into a problem with a simple program (which performs several MPI_Bcast operations) hanging. Most processes hang in MPI_Finalize, the others hang in MPI_Bcast. Interestingly enough, this only happens when I oversubscribe the nodes. For instance, using IU's Odin cluster, I take 4 nodes (each has two Opteron processors) and run 9 processes:

        mpirun -np 9 ./a.out

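The attached source isn't shown inline here, so as a rough illustration only: a minimal sketch of this kind of program (the buffer size and broadcast count below are placeholders, not the values in the attachment) looks something like:

    #include <mpi.h>
    #include <vector>

    int main(int argc, char* argv[])
    {
        MPI_Init(&argc, &argv);

        int rank = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        // Several broadcasts rooted at rank 0; every rank must reach
        // each MPI_Bcast with matching arguments before MPI_Finalize.
        std::vector<int> buf(1024, rank);
        for (int i = 0; i < 10; ++i) {
            MPI_Bcast(&buf[0], static_cast<int>(buf.size()), MPI_INT,
                      0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }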
The backtraces from 7 of the 9 processes show that they're in MPI_Finalize:

#0  0x0000003d1b92e813 in sigprocmask () from /lib64/tls/libc.so.6
#1  0x0000002a9598f55f in poll_dispatch ()
   from /san/mpi/openmpi-1.1-gcc/lib/libopal.so.0
#2  0x0000002a9598e3f3 in opal_event_loop ()
   from /san/mpi/openmpi-1.1-gcc/lib/libopal.so.0
#3  0x0000002a960487c4 in mca_oob_tcp_msg_wait ()
   from /san/mpi/openmpi-1.1-gcc/lib/openmpi/mca_oob_tcp.so
#4  0x0000002a9604ca13 in mca_oob_tcp_recv ()
   from /san/mpi/openmpi-1.1-gcc/lib/openmpi/mca_oob_tcp.so
#5  0x0000002a9585d833 in mca_oob_recv_packed ()
   from /san/mpi/openmpi-1.1-gcc/lib/liborte.so.0
#6  0x0000002a9585dd37 in mca_oob_xcast ()
   from /san/mpi/openmpi-1.1-gcc/lib/liborte.so.0
#7  0x0000002a956cbfb0 in ompi_mpi_finalize ()
   from /san/mpi/openmpi-1.1-gcc/lib/libmpi.so.0
#8  0x000000000040bd3e in main ()

The other two processes are in MPI_Bcast:

#0  0x0000002a97c2cbe3 in mca_btl_mvapi_component_progress ()
   from /san/mpi/openmpi-1.1-gcc/lib/openmpi/mca_btl_mvapi.so
#1  0x0000002a97b21072 in mca_bml_r2_progress ()
   from /san/mpi/openmpi-1.1-gcc/lib/openmpi/mca_bml_r2.so
#2  0x0000002a95988a4a in opal_progress ()
   from /san/mpi/openmpi-1.1-gcc/lib/libopal.so.0
#3  0x0000002a97a13fe7 in mca_pml_ob1_recv ()
   from /san/mpi/openmpi-1.1-gcc/lib/openmpi/mca_pml_ob1.so
#4  0x0000002a9846d0aa in ompi_coll_tuned_bcast_intra_chain ()
   from /san/mpi/openmpi-1.1-gcc/lib/openmpi/mca_coll_tuned.so
#5  0x0000002a9846d100 in ompi_coll_tuned_bcast_intra_pipeline ()
   from /san/mpi/openmpi-1.1-gcc/lib/openmpi/mca_coll_tuned.so
#6  0x0000002a9846a3d7 in ompi_coll_tuned_bcast_intra_dec_fixed ()
   from /san/mpi/openmpi-1.1-gcc/lib/openmpi/mca_coll_tuned.so
#7  0x0000002a956deae3 in PMPI_Bcast ()
   from /san/mpi/openmpi-1.1-gcc/lib/libmpi.so.0
#8  0x000000000040bcc7 in main ()

Other random information:
        - The two processes stuck in MPI_Bcast are not on the same node. This has been the case both times I've gone through the backtraces, but I can't conclude that it's a necessary condition.
        - If I force the use of the "basic" MCA for collectives, this problem does not occur (example command below).
        - If I don't oversubscribe the nodes, things seem to work properly.
        - The C++ program source and the output of ompi_info are attached.
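Forcing the basic collective component was done through Open MPI's MCA parameter syntax, something along the lines of the following (the exact component list may vary):

        mpirun --mca coll basic,self -np 9 ./a.out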

This should be easy to reproduce for anyone with access to Odin. I'm using Open MPI 1.1 configured with no special options. It is available as the module "mpi/openmpi-1.1-gcc" on the cluster. I'm using SLURM interactively to allocate the nodes before executing mpirun:

        srun -A -N 4

        Cheers,
        Doug Gregor

Attachment: broadcast_skeleton_content.cpp

Attachment: ompi_info.log
