The attached code is an example where openmpi/1.3.2 locks up when run on 48 cores over IB (4 cores per node). Rank 0 loops over receives from all processors, and all other ranks send to it. As far as I know this should work, and I can't see why it wouldn't. Yes, I know we can do the same thing with a gather; this is a simple case to demonstrate the issue. If I increase the openib eager limit, the program runs, which normally points to improper MPI usage, but I can't figure out on my own what the problem with this code is.
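
For anyone reading without the attachment, the pattern is roughly the following. This is only a minimal sketch of the structure described above, not the contents of sendbuf.f90: the message length `n`, the tag, the data type, and all variable names are assumptions made for illustration.

```fortran
program sendbuf_sketch
  use mpi
  implicit none
  ! Assumed message length; the size used in the real test case is not shown here.
  integer, parameter :: n = 100000
  integer :: ierr, rank, nprocs, src
  integer :: status(MPI_STATUS_SIZE)
  double precision, allocatable :: buf(:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  allocate(buf(n))

  if (rank == 0) then
     ! Rank 0 posts one blocking receive per sender, in rank order.
     do src = 1, nprocs - 1
        call MPI_Recv(buf, n, MPI_DOUBLE_PRECISION, src, 0, &
                      MPI_COMM_WORLD, status, ierr)
     end do
  else
     ! Every other rank does a single blocking send to rank 0.
     buf = dble(rank)
     call MPI_Send(buf, n, MPI_DOUBLE_PRECISION, 0, 0, &
                   MPI_COMM_WORLD, ierr)
  end if

  deallocate(buf)
  call MPI_Finalize(ierr)
end program sendbuf_sketch
```

Each send here has exactly one matching receive, which is why I would expect it to complete regardless of whether a message goes over the eager or the rendezvous path.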

Any input on why code like this locks up unless we raise the eager limit would be helpful. We should not have to increase the buffer size just to make code run; doing so feels hacky and dirty.

Attachment: sendbuf.f90
Description: Binary data



Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985


