I have an MPI program that is fairly straightforward: essentially "initialize,
two sends from master to slaves, two receives on the slaves, do a bunch of
system calls for copying/moving files and then running a serial code on each
MPI task, tidy up, and MPI finalize".
This seems simple enough, but I can't get MPI_FINALIZE to work correctly.
Below is a snapshot of the program, with all the system copy/move/external-code
calls rolled up into "do codish stuff" placeholder comments.
program mpi_finalize_break
  !<variable declarations>
  call MPI_INIT(ierr)
  icomm = MPI_COMM_WORLD
  call MPI_COMM_SIZE(icomm,nproc,ierr)
  call MPI_COMM_RANK(icomm,rank,ierr)

  !<do codish stuff for a while>

  if (rank == 0) then
    !<set up some stuff then call MPI_SEND in a loop over number of slaves>
    call MPI_SEND(numat,1,MPI_INTEGER,n,0,icomm,ierr)
    call MPI_SEND(n_to_add,1,MPI_INTEGER,n,0,icomm,ierr)
  else
    call MPI_Recv(begin_mat,1,MPI_INTEGER,0,0,icomm,status,ierr)
    call MPI_Recv(nrepeat,1,MPI_INTEGER,0,0,icomm,status,ierr)
    !<do codish stuff for a while>
  endif

  print*, "got here4", rank
  call MPI_BARRIER(icomm,ierr)
  print*, "got here5", rank, ierr
  call MPI_FINALIZE(ierr)
  print*, "got here6"
end program mpi_finalize_break
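To give a feel for what the "do codish stuff" sections contain: on each rank they
boil down to shell work of roughly this shape. The file names, directory names,
and ./run_some_code below are placeholders of my own, not the real names, and in
the actual code these are plain system()-style calls; I've written the sketch
with execute_command_line just to show the pattern:

  ! Rough shape of a "<do codish stuff>" block on each rank.
  ! File/directory names and ./run_some_code are placeholders.
  character(len=16) :: rankstr
  integer :: istat

  write(rankstr,'(I0)') rank
  call execute_command_line("cp base_input.inp run_"//trim(rankstr)//"/", &
                            wait=.true., exitstat=istat)
  call execute_command_line("cd run_"//trim(rankstr)//" && ./run_some_code", &
                            wait=.true., exitstat=istat)
  call execute_command_line("mv run_"//trim(rankstr)//"/out.dat results/", &
                            wait=.true., exitstat=istat)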
The problem I am seeing occurs around the "got here4", "got here5", and "got
here6" statements. I get the appropriate number of print statements, with the
corresponding ranks, for "got here4" as well as "got here5". In other words, the
master and all the slaves (rank 0 and all other ranks) reach the barrier call,
make it through it, and reach MPI_FINALIZE, reporting ierr = 0 on all of them.
However, after MPI_FINALIZE, at "got here6", I get all kinds of weird behavior:
sometimes one fewer "got here6" than I expect, sometimes eight fewer (it
varies), and the program then hangs forever, never closing and leaving an
orphaned process on one (or more) of the compute nodes.
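In case it matters for the missing prints: these are plain list-directed prints
with no explicit flush. A variant of the tail of the program that I could try,
just to rule out stdout buffering and to see which ranks never print, would look
like this (a sketch only; output_unit comes from iso_fortran_env):

  ! Sketch: same tail of the program, with explicit flushes and the rank
  ! added to the post-finalize print so missing ranks can be identified.
  ! Assumes "use iso_fortran_env, only: output_unit" near the top of the program.
  print*, "got here5", rank, ierr
  flush(output_unit)
  call MPI_FINALIZE(ierr)
  print*, "got here6", rank
  flush(output_unit)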
I am running this on a machine with an InfiniBand backbone, with the NFS share
exported over InfiniBand (NFS over RDMA). I'm trying to understand why the
MPI_BARRIER call works fine, yet MPI_FINALIZE ends up with random orphaned
processes (not on the same node, nor the same number of orphans, each time). My
guess is that it is related to the various system calls (cp, mv,
./run_some_code, cp, mv), but I wasn't sure whether the speed of InfiniBand
could also play a role, since all of this happens fairly quickly. My intuition
could be wrong, of course. Does anybody have thoughts? I can post the whole code
if helpful, but I believe this condensed version captures it. I'm running
OpenMPI 1.8.4 compiled against ifort 15.0.2, with Mellanox adapters running
firmware 2.9.1000; this is the Mellanox firmware available through yum on
CentOS 6.5, kernel 2.6.32-504.8.1.el6.x86_64. Here is the relevant ib0
interface and HCA info:
ib0 Link encap:InfiniBand HWaddr
80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:192.168.6.254 Bcast:192.168.6.255 Mask:255.255.255.0
inet6 addr: fe80::202:c903:57:e7fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
RX packets:10952 errors:0 dropped:0 overruns:0 frame:0
TX packets:9805 errors:0 dropped:625413 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:830040 (810.5 KiB) TX bytes:643212 (628.1 KiB)
hca_id: mlx4_0
    transport:          InfiniBand (0)
    fw_ver:             2.9.1000
    node_guid:          0002:c903:0057:e7fc
    sys_image_guid:     0002:c903:0057:e7ff
    vendor_id:          0x02c9
    vendor_part_id:     26428
    hw_ver:             0xB0
    board_id:           MT_0D90110009
    phys_port_cnt:      1
        port:   1
            state:          PORT_ACTIVE (4)
            max_mtu:        4096 (5)
            active_mtu:     4096 (5)
            sm_lid:         1
            port_lid:       2
            port_lmc:       0x00
            link_layer:     InfiniBand
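For reference, the bare skeleton I can compare against on the same nodes, with
the sends and all the system calls stripped out, would be just this (a sketch
for comparison only, not part of the real code):

program finalize_skeleton
  ! Minimal init/barrier/finalize test with no sends and no system calls,
  ! to check whether the basic shutdown path alone behaves on this machine.
  use mpi
  implicit none
  integer :: ierr, icomm, nproc, rank

  call MPI_INIT(ierr)
  icomm = MPI_COMM_WORLD
  call MPI_COMM_SIZE(icomm, nproc, ierr)
  call MPI_COMM_RANK(icomm, rank, ierr)
  call MPI_BARRIER(icomm, ierr)
  print*, "before finalize", rank
  call MPI_FINALIZE(ierr)
  print*, "after finalize", rank
end program finalize_skeleton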
This problem only occurs in this simple implementation, hence my thinking that
it is tied to the system calls. I run several other, much larger and more
robust MPI codes on this machine without issue. Thanks for the help.
--Jack