Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-28 Thread Scott Atchley
On Oct 28, 2010, at 2:50 PM, Ray Muno wrote:
> On 10/28/2010 01:40 PM, Scott Atchley wrote:
>> Does your environment have LD_LIBRARY_PATH set to point to $OMPI/lib and $MX/lib? Does it get set on login? Is $OMPI/bin in your PATH?
>>
>> Scott
>
> $MX/lib was not in LD_LIBRARY_PATH
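For anyone following along, the LD_LIBRARY_PATH check asked about here can be scripted. Below is a minimal Python sketch, not something posted in the thread. The helper name `missing_dirs` is made up for illustration, and `/opt/openmpi` and `/opt/mx` are hypothetical stand-ins for $OMPI and $MX; substitute the real install prefixes.

```python
import os


def missing_dirs(path_value, required):
    """Return the entries of `required` that are absent from a
    colon-separated search path such as LD_LIBRARY_PATH or PATH."""
    # Normalise entries so "/opt/mx/lib/" and "/opt/mx/lib" compare equal.
    present = {os.path.normpath(p) for p in path_value.split(":") if p}
    return [d for d in required if os.path.normpath(d) not in present]


if __name__ == "__main__":
    # Hypothetical defaults; neither prefix comes from the thread.
    ompi = os.environ.get("OMPI", "/opt/openmpi")
    mx = os.environ.get("MX", "/opt/mx")

    lib_missing = missing_dirs(
        os.environ.get("LD_LIBRARY_PATH", ""),
        [os.path.join(ompi, "lib"), os.path.join(mx, "lib")],
    )
    bin_missing = missing_dirs(
        os.environ.get("PATH", ""),
        [os.path.join(ompi, "bin")],
    )
    print("missing from LD_LIBRARY_PATH:", lib_missing)
    print("missing from PATH:", bin_missing)
```

Running it in a login shell on the head node and again on a compute node would show whether the two environments differ. Note that it only inspects the environment variables; directories configured under /etc/ld.so.conf.d are not covered by this sketch.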

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-28 Thread Ray Muno
On 10/28/2010 01:40 PM, Scott Atchley wrote:
> Does your environment have LD_LIBRARY_PATH set to point to $OMPI/lib and $MX/lib? Does it get set on login? Is $OMPI/bin in your PATH?
>
> Scott
$MX/lib was not in LD_LIBRARY_PATH
That is interesting. On the head node, [/etc/ld.so.conf.d]$

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-28 Thread Scott Atchley
On Oct 28, 2010, at 2:18 PM, Ray Muno wrote:
> On 10/22/2010 07:36 AM, Scott Atchley wrote:
>> Ray,
>>
>> Looking back at your original message, you say that it works if you use the Myricom supplied mpirun from the Myrinet roll. I wonder if this is a mismatch between libraries on the comp

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-28 Thread Ray Muno
On 10/22/2010 07:36 AM, Scott Atchley wrote:
> Ray,
>
> Looking back at your original message, you say that it works if you use the Myricom supplied mpirun from the Myrinet roll. I wonder if this is a mismatch between libraries on the compute nodes.
>
> What do you get if you use your OMPI'

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-22 Thread Scott Atchley
Ray,
Looking back at your original message, you say that it works if you use the Myricom supplied mpirun from the Myrinet roll. I wonder if this is a mismatch between libraries on the compute nodes.
What do you get if you use your OMPI's mpirun with:
$ mpirun -n 1 -H ldd $PWD/
I am wondering
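The ldd output collected this way can be scanned for unresolved libraries, which is one way a library mismatch on a compute node would show up. A rough Python sketch follows, assuming the usual `name => path (address)` and `name => not found` line formats that ldd prints on Linux. The function name `unresolved_libs` is hypothetical, and the sample text is invented for illustration; it is not output from the cluster discussed in this thread.

```python
def unresolved_libs(ldd_output):
    """Return the names of shared libraries that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Lines look like "libfoo.so => /path/libfoo.so (0x...)"
        # or "libfoo.so => not found".
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


# Invented sample output, for illustration only.
sample = "\n".join([
    "\tlibmpi.so.0 => /opt/openmpi/lib/libmpi.so.0 (0x00002b0000000000)",
    "\tlibmyriexpress.so => not found",
    "\tlibc.so.6 => /lib64/libc.so.6 (0x00002b0000400000)",
])

if __name__ == "__main__":
    print(unresolved_libs(sample))
```

Any library listed by this helper is one the dynamic linker could not locate in the environment where ldd ran, so it is worth comparing the result from the head node against a compute node.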

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-22 Thread Scott Atchley
On Oct 20, 2010, at 9:43 PM, Raymond Muno wrote:
> On 10/20/2010 8:30 PM, Scott Atchley wrote
>> Are you building OMPI with support for both MX and IB? If not and you only want MX support, try configuring OMPI using --disable-memory-manager (check configure for the exact option).
>>
>> We

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-21 Thread Raymond Muno
On 10/20/2010 8:30 PM, Scott Atchley wrote:
> We have fixed this bug in the most recent 1.4.x and 1.5.x releases.
>
> Scott
OK, a few more tests. I was using PGI 10.4 as the compiler. I have now tried OpenMPI 1.4.3 with PGI 10.8 and Intel 11.1. I get the same results in each case, mpirun seg faul

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-20 Thread Raymond Muno
On 10/20/2010 8:30 PM, Scott Atchley wrote
> Are you building OMPI with support for both MX and IB? If not and you only want MX support, try configuring OMPI using --disable-memory-manager (check configure for the exact option).
>
> We have fixed this bug in the most recent 1.4.x and 1.5.x releases

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-20 Thread Raymond Muno
On 10/20/2010 8:30 PM, Scott Atchley wrote:
> On Oct 20, 2010, at 9:22 PM, Raymond Muno wrote:
>> On 10/20/2010 7:59 PM, Ralph Castain wrote:
>>> The error message seems to imply that mpirun itself didn't segfault, but that something else did. Is that segfault pid from mpirun?
>>>
>>> This kind of problem u

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-20 Thread Scott Atchley
On Oct 20, 2010, at 9:22 PM, Raymond Muno wrote:
> On 10/20/2010 7:59 PM, Ralph Castain wrote:
>> The error message seems to imply that mpirun itself didn't segfault, but that something else did. Is that segfault pid from mpirun?
>>
>> This kind of problem usually is caused by mismatched buil

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-20 Thread Raymond Muno
On 10/20/2010 7:59 PM, Ralph Castain wrote:
> The error message seems to imply that mpirun itself didn't segfault, but that something else did. Is that segfault pid from mpirun?
>
> This kind of problem usually is caused by mismatched builds - i.e., you compile against your new build, but you pick

Re: [OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-20 Thread Ralph Castain
The error message seems to imply that mpirun itself didn't segfault, but that something else did. Is that segfault pid from mpirun? This kind of problem usually is caused by mismatched builds - i.e., you compile against your new build, but you pick up the Myrinet build when you try to run becau

[OMPI users] OpenMPI 1.4.2 with Myrinet MX, mpirun seg faults

2010-10-20 Thread Raymond Muno
We are doing a test build of a new cluster. We are re-using our Myrinet 10G gear from a previous cluster. I have built OpenMPI 1.4.2 with PGI 10.4. We use this regularly on our Infiniband based cluster and all the install elements were readily available. With a few go-arounds with the My