I have an MPI program that is fairly straightforward: essentially "initialize,
2 sends from master to slaves, 2 receives on slaves, do a bunch of system calls
for copying/moving files and running a serial code on each MPI task, tidy up and
MPI finalize".
Simple enough, but I'm not getting MPI_FINALIZE to work correctly.
Below is a snapshot of the program, with all the system copy/move/external-code
calls rolled up into "<do codish stuff>" type statements.
program mpi_finalize_break
use mpi            ! (or include 'mpif.h'; trimmed here along with the declarations)
implicit none
!<variable declarations>
call MPI_INIT(ierr)
icomm = MPI_COMM_WORLD
call MPI_COMM_SIZE(icomm,nproc,ierr)
call MPI_COMM_RANK(icomm,rank,ierr)

!<do codish stuff for a while>
if (rank == 0) then
    !<set up some stuff then call MPI_SEND in a loop over number of slaves>
    call MPI_SEND(numat,1,MPI_INTEGER,n,0,icomm,ierr)
    call MPI_SEND(n_to_add,1,MPI_INTEGER,n,0,icomm,ierr)
else
    call MPI_Recv(begin_mat,1,MPI_INTEGER,0,0,icomm,status,ierr)
    call MPI_Recv(nrepeat,1,MPI_INTEGER,0,0,icomm,status,ierr)
    !<do codish stuff for a while>
endif

print*, "got here4", rank
call MPI_BARRIER(icomm,ierr)
print*, "got here5", rank, ierr
call MPI_FINALIZE(ierr)

print*, "got here6"
end program mpi_finalize_break
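For reference, the "<do codish stuff>" placeholders are basically a string of shell
commands run from every rank. A rough sketch of what that looks like is below; the
directory, file, and executable names are made up, not the real ones:

! hypothetical sketch of the per-rank shell work, not the actual code
subroutine do_codish_stuff(rank)
    use ifport                            ! ifort's portability module for SYSTEM()
    implicit none
    integer, intent(in) :: rank
    integer :: istat
    character(len=64) :: dirname
    write(dirname,'(A,I0)') "workdir_", rank                               ! one scratch dir per rank
    istat = system("mkdir -p "//trim(dirname))                             ! created on the NFS share
    istat = system("cp input.dat "//trim(dirname)//"/")                    ! stage input for the serial code
    istat = system("cd "//trim(dirname)//" && ./run_some_code > run.log")  ! run the serial code
    istat = system("mv "//trim(dirname)//"/output.dat "//trim(dirname)//"_output.dat")  ! collect results
end subroutine do_codish_stuff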
Now the problem I am seeing occurs around the "got here4", "got here5" and "got
here6" statements. I get the appropriate number of print statements, with the
corresponding ranks, for "got here4" and for "got here5". Meaning, the master
and all the slaves (rank 0 and all other ranks) got to the barrier call, through
the barrier call, and to MPI_FINALIZE, reporting ierr = 0 on all of them.
However, after MPI_FINALIZE, at "got here6", I get all kinds of weird behavior.
Sometimes I get one fewer "got here6" than I expect, sometimes eight fewer (it
varies), and the program hangs forever, never closing, and leaves an orphaned
process on one (or more) of the compute nodes.
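(I realize the missing "got here6" prints on their own could just be stdout
buffering, so one thing I could do is flush explicitly after MPI_FINALIZE, along
the lines below; but that wouldn't explain the hang or the orphaned processes.)

call MPI_FINALIZE(ierr)
print*, "got here6", rank     ! printing rank here would also show which ranks go missing
flush(6)                      ! unit 6 is stdout with ifort; push the buffer out before the program ends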
I am running this on a machine with an InfiniBand backbone, with the NFS server
shared over InfiniBand (NFS/RDMA). I'm trying to figure out how the MPI_BARRIER
call works fine, yet MPI_FINALIZE ends up with random orphaned runs (not the same
node, nor the same number of orphans, every time). I'm guessing it is related to
the various system calls to cp, mv, ./run_some_code, cp, mv, but wasn't sure if
it may be related to the speed of InfiniBand too, as all this happens fairly
quickly. I could have wrong intuition as well. Anybody have thoughts? I can post
the whole code if helpful, but I believe this condensed version captures it.
I'm running OpenMPI 1.8.4 compiled against ifort 15.0.2, with Mellanox adapters
running firmware 2.9.1000. This is the Mellanox firmware available through yum
with CentOS 6.5, kernel 2.6.32-504.8.1.el6.x86_64.
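Right now each of those cp/mv/run steps is fired off without checking its status.
One thing I could try is a small wrapper like the sketch below, so a failed (or
still-running) child command would at least be visible; run_checked is a made-up
helper name, and this assumes ifort 15.0.2 supports the Fortran 2008
execute_command_line:

! hypothetical helper (made-up name), not part of the actual code
subroutine run_checked(cmd, rank)
    implicit none
    character(len=*), intent(in) :: cmd
    integer, intent(in) :: rank
    integer :: estat, cstat
    ! wait=.true. blocks until the child process exits, so nothing should still
    ! be running when this rank reaches MPI_FINALIZE
    call execute_command_line(trim(cmd), wait=.true., exitstat=estat, cmdstat=cstat)
    if (cstat /= 0 .or. estat /= 0) then
        print*, "command failed on rank", rank, ": ", trim(cmd), " exitstat=", estat
    end if
end subroutine run_checked

With wait=.true. each child has definitely exited before the rank moves on, though
I don't know whether that has anything to do with the orphans.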

ib0       Link encap:InfiniBand  HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:192.168.6.254  Bcast:192.168.6.255  Mask:255.255.255.0
          inet6 addr: fe80::202:c903:57:e7fd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:10952 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9805 errors:0 dropped:625413 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:830040 (810.5 KiB)  TX bytes:643212 (628.1 KiB)

hca_id: mlx4_0
        transport:                      InfiniBand (0)
        fw_ver:                         2.9.1000
        node_guid:                      0002:c903:0057:e7fc
        sys_image_guid:                 0002:c903:0057:e7ff
        vendor_id:                      0x02c9
        vendor_part_id:                 26428
        hw_ver:                         0xB0
        board_id:                       MT_0D90110009
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 1
                        port_lid:               2
                        port_lmc:               0x00
                        link_layer:             InfiniBand

This problem only occurs in this simple implementation, which is why I think it
is tied to the system calls. I run several other, much larger and more robust
MPI codes on this machine without issue. Thanks for the help.
--Jack
