Building 1.2.6 with the PGI compilers produces an MPI library that works with tcp and sm, but it segfaults on openib.

Both our Intel and PGI builds of 1.2.6 blow up like this when we force IB, so this is a new issue.
I have Open MPI 1.2.6 installed on my machines with the Intel compiler (version 10.1) and the PGI compiler (version 7.1-5), and both builds work with IB without any problem. BTW, Mellanox provides a Mellanox OFED binary distribution that includes Intel and PGI builds of Open MPI 1.2.6.
You can download it from here: http://www.mellanox.com/products/ofed.php


Is there a way to shut off early completion in 1.2.3?
Sure, just add "--mca pml_ob1_use_early_completion 0" to your command line.
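
For reference, a minimal sketch of what that command line might look like. The application name (./my_app) and the rank count (8) are placeholders, not taken from this thread; only the MCA parameter itself comes from the reply above. The script prints the command rather than running it, so it can be checked before use.

```shell
#!/bin/sh
# Placeholder values (assumptions): substitute your own binary and rank count.
APP=./my_app
NP=8

# Disable early completion in the ob1 PML, as suggested in the reply above.
CMD="mpirun -np $NP --mca pml_ob1_use_early_completion 0 $APP"

# Print the command for inspection; uncomment the last line to actually run it.
echo "$CMD"
# $CMD
```

Running "ompi_info --param pml ob1" on the installed build should list the parameter if that version supports it.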
Or is the above a known issue, and should I use 1.2.7-pre or grab a 1.3 snapshot?
1.2.6 should be ok.

Regards,
Pasha



On Jul 2, 2008, at 10:42 AM, Pavel Shamis (Pasha) wrote:
Maybe this FAQ will help: http://www.open-mpi.org/faq/?category=openfabrics#v1.2-use-early-completion

Brock Palen wrote:
We have a code (arts) that locks up only when running on IB. Works fine on tcp and sm.

When we ran it in a debugger, it locked up in an MPI_Comm_split() call that, as far as I could tell, was valid. Because the split was a hack they used to call MPI_File_open() on a single CPU, we reworked it to remove the split. The code then locked up again.

This time it locked up in an MPI_Allreduce(), which was really strange. When running on 8 CPUs, only rank 4 would get stuck. The rest of the ranks are fine and get the right value. (We are using DDT as our debugger.)

It's very strange. Do you have any idea what could cause this to happen? We are using openmpi-1.2.3/1.2.6 with PGI compilers.
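
The transport comparison described above (fine on tcp and sm, hang on IB) can be reproduced by selecting the BTLs explicitly on the command line. The following is only a sketch: the application name (./my_app), the rank count (8), and the helper function run_line are placeholders for illustration, not something taken from the original report. It prints one mpirun line per transport set rather than launching anything.

```shell
#!/bin/sh
# Placeholder values (assumptions): substitute your own binary and rank count.
APP=./my_app
NP=8

# Hypothetical helper: build the mpirun line for a given BTL list.
run_line() {
    echo "mpirun -np $NP --mca btl $1 $APP"
}

# First the transports reported to work, then IB only.
# "self" is included so a rank can send to itself.
run_line tcp,sm,self
run_line openib,self
```

If the hang appears only with the second line, that points at the openib path rather than at the application logic.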


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

