Re: [OMPI users] Problems with openmpi-1.4.3

2011-03-21 Thread David Zhang
I don't know if your alias got mapped when mpicc is called. Try adding /usr/lib to LD_LIBRARY_PATH? On Sun, Mar 20, 2011 at 7:43 PM, Amos Leffler wrote: > Hi, > I have been having problems getting openmpi-1.4.3 working with Linux > under SUSE 11.3. I have put the following entries in .bashrc:

Re: [OMPI users] Problems with openmpi-1.4.3

2011-03-21 Thread Gustavo Correa
Hi Amos This form perhaps? 'export PATH=/opt/openmpi/bin:$PATH' You don't want to wipe off the existing path, just add openmpi to it. Intel also has its own shared libraries, which may be causing trouble. My guess is that you need to set the Intel environment first by placing a line more or l
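Gustavo's suggestion can be sketched as a pair of .bashrc lines; the /opt/openmpi prefix here is a placeholder, so substitute whatever --prefix you configured Open MPI with:

```shell
# Prepend the Open MPI install to the existing search paths instead
# of replacing them. /opt/openmpi is an assumed prefix; substitute
# your own install location.
export PATH=/opt/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH
```

Prepending keeps system tools reachable while ensuring the intended mpicc and mpirun are found first; `which mpicc` should then report the Open MPI copy.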

[OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
I have a question about using OpenMPI and Torque on stateless nodes. I have compiled openmpi 1.4.3 with --with-tm=/usr/local --without-slurm using intel compiler version 11.1.075. When I run a simple "hello world" mpi program, I am receiving the following error. [node164:11193] plm:tm: failed to

[OMPI users] bizarre failure with IMB/openib

2011-03-21 Thread Dave Love
I'm trying to test some new nodes with ConnectX adaptors, and failing to get (so far just) IMB to run on them. The binary runs on the same cluster using TCP, or using PSM on some other IB nodes. A rebuilt PMB and various existing binaries work with openib on the ConnectX nodes running it exactly

[OMPI users] intel compiler linking issue and issue of environment variable on remote node, with open mpi 1.4.3

2011-03-21 Thread yanyg
Hi, I am trying to compile our codes with open mpi 1.4.3, using Intel compilers 8.1. (1) For open mpi 1.4.3 installation on linux beowulf cluster, I use: ./configure --prefix=/home/yiguang/dmp-setup/openmpi-1.4.3 CC=icc CXX=icpc F77=ifort FC=ifort --enable-static LDFLAGS="-i-static -static-lib

[OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Dave Love
I've just tried 1.5.3 under SGE with tight integration, which seems to be broken. I built and ran in the same way as for 1.4.{1,3}, which works, and ompi_info reports the same gridengine parameters for 1.5 as for 1.4. The symptoms are that it reports a failure to communicate using ssh, whereas it

Re: [OMPI users] intel compiler linking issue and issue of environment variable on remote node, with open mpi 1.4.3

2011-03-21 Thread Tim Prince
On 3/21/2011 5:21 AM, ya...@adina.com wrote: I am trying to compile our codes with open mpi 1.4.3, using Intel compilers 8.1. (1) For open mpi 1.4.3 installation on linux beowulf cluster, I use: ./configure --prefix=/home/yiguang/dmp-setup/openmpi-1.4.3 CC=icc CXX=icpc F77=ifort FC=ifort --enable

Re: [OMPI users] OpenMPI 1.2.x segfault as regular user

2011-03-21 Thread Prentice Bisbal
On 03/20/2011 06:22 PM, kevin.buck...@ecs.vuw.ac.nz wrote: > >> It's not hard to test whether or not SELinux is the problem. You can >> turn SELinux off on the command-line with this command: >> >> setenforce 0 >> >> Of course, you need to be root in order to do this. >> >> After turning SELinux o

Re: [OMPI users] bizarre failure with IMB/openib

2011-03-21 Thread Peter Kjellström
On Monday, March 21, 2011 12:25:37 pm Dave Love wrote: > I'm trying to test some new nodes with ConnectX adaptors, and failing to > get (so far just) IMB to run on them. ... > I'm using gcc-compiled OMPI 1.4.3 and the current RedHat 5 OFED with IMB > 3.2.2, specifying `btl openib,sm,self' (or `mtl

Re: [OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Terry Dontje
Dave what version of Grid Engine are you using? The plm checks for the following env vars to determine if you are running Grid Engine: SGE_ROOT ARC PE_HOSTFILE JOB_ID If these are not present in the session in which mpirun is executed then it will resort to ssh. --td On 03/21/2011 08:24 AM,
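Terry's four-variable check can be restated as a small shell sketch. This is only an illustration of the described logic, not Open MPI's actual source, and the sample values are made up:

```shell
# Hypothetical re-statement of the plm's Grid Engine detection:
# all four variables must be present, otherwise fall back to ssh.
detect_launcher() {
  if [ -n "$SGE_ROOT" ] && [ -n "$ARC" ] &&
     [ -n "$PE_HOSTFILE" ] && [ -n "$JOB_ID" ]; then
    echo gridengine
  else
    echo ssh
  fi
}

# Simulate a tight-integration session (placeholder values):
export SGE_ROOT=/opt/sge ARC=lx24-amd64 PE_HOSTFILE=/tmp/pe_hostfile JOB_ID=42
detect_launcher   # prints "gridengine"

unset JOB_ID
detect_launcher   # prints "ssh"
```

Checking these variables inside the job session (e.g. `env | grep -E 'SGE_ROOT|ARC|PE_HOSTFILE|JOB_ID'`) is a quick way to see which path mpirun will take.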

Re: [OMPI users] bizarre failure with IMB/openib

2011-03-21 Thread Dave Love
Peter Kjellström writes: > Are you sure you launched it correctly and that you have (re)built OpenMPI > against your Redhat-5 ib stack? Yes. I had to rebuild because I'd omitted openib when we only needed psm. As I said, I did exactly the same thing successfully with PMB (initially because I

Re: [OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Dave Love
Terry Dontje writes: > Dave what version of Grid Engine are you using? 6.2u5, plus irrelevant patches. It's fine with ompi 1.4. (All I did to switch was to load the 1.5.3 modules environment.) > The plm checks for the following env-var's to determine if you are > running Grid Engine. > SGE_RO

Re: [OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Ralph Castain
Just looking at this for another question. Yes, SGE integration is broken in 1.5. Looking at how to fix it now. Meantime, you can get it to work by adding "-mca plm ^rshd" to your mpirun cmd line. On Mar 21, 2011, at 9:47 AM, Dave Love wrote: > Terry Dontje writes: > >> Dave what version of Grid
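As a command-line fragment, Ralph's workaround looks like the following; `./a.out` and the process count are placeholders, and this only applies under SGE with the broken 1.5.x integration:

```shell
# Exclude the rshd plm component so mpirun falls back to a working
# launcher under SGE (1.5.x workaround from the message above).
# "./a.out" stands in for your MPI binary.
mpirun -mca plm ^rshd -np 4 ./a.out
```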

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Ralph Castain
Can you run anything under TM? Try running "hostname" directly from Torque to see if anything works at all. The error message is telling you that the Torque daemon on the remote node reported a failure when trying to launch the OMPI daemon. Could be that Torque isn't setup to forward environmen
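A minimal TM sanity check along the lines Ralph suggests might look like this; node counts are placeholders, and pbsdsh must be run from inside a Torque job:

```shell
# Step 1: get an interactive allocation (run from the head node):
#   qsub -I -l nodes=2:ppn=1
# Step 2: from inside that job, ask TM to run a trivial command on
# every allocated node. Failure here points at Torque itself rather
# than at Open MPI:
pbsdsh hostname
```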

[OMPI users] Displaying MAIN in Totalview

2011-03-21 Thread David Turner
Hi, About a month ago, this topic was discussed with no real resolution: http://www.open-mpi.org/community/lists/users/2011/02/15538.php We noticed the same problem (TV does not display the user's MAIN routine upon initial startup), and contacted the TV developers. They suggested a simple OMPI

Re: [OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Dave Love
Ralph Castain writes: > Just looking at this for another question. Yes, SGE integration is broken in > 1.5. Looking at how to fix it now. > > Meantime, you can get it to work by adding "-mca plm ^rshd" to your mpirun cmd > line. Thanks. I'd forgotten about plm when checking, though I guess that wou

Re: [OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Ralph Castain
On Mar 21, 2011, at 11:12 AM, Dave Love wrote: > Ralph Castain writes: > >> Just looking at this for another question. Yes, SGE integration is broken in >> 1.5. Looking at how to fix it now. >> >> Meantime, you can get it to work by adding "-mca plm ^rshd" to your mpirun cmd >> line. > > Thanks.

Re: [OMPI users] Displaying MAIN in Totalview

2011-03-21 Thread Ralph Castain
Ick - appears that got dropped a long time ago. I'll add it back in and post a CMR for 1.4 and 1.5 series. Thanks! Ralph On Mar 21, 2011, at 11:08 AM, David Turner wrote: > Hi, > > About a month ago, this topic was discussed with no real resolution: > > http://www.open-mpi.org/community/list

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
I am not sure if there is any extra configuration necessary for torque to forward the environment. I have included the output of printenv for an interactive qsub session. I am really at a loss here because I never had this much difficulty making torque run with openmpi. It has been mostly a good
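For reference, Torque's qsub can forward environment variables to the job explicitly; a sketch, where `job.sh` and the library path are placeholders:

```shell
# -V exports the submitter's entire environment to the job;
# -v forwards (or sets) specific variables only.
qsub -V job.sh
qsub -v LD_LIBRARY_PATH=/opt/openmpi/lib job.sh
```

If the MPI daemons fail only on remote nodes, comparing `printenv` under an interactive job against a plain ssh login is a quick way to spot what Torque is dropping.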

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Jeff Squyres
I no longer run Torque on my cluster, so my Torqueology is pretty rusty -- but I think there's a Torque command to launch on remote nodes. tmrsh or pbsrsh or something like that...? Try that and make sure it works. Open MPI should be using the same API as that command under the covers. I als

Re: [OMPI users] Displaying MAIN in Totalview

2011-03-21 Thread Peter Thompson
Gee, I had tried posting that info earlier today, but my post was rejected because my email address has changed. This is as much a test of that address change request as it is a confirmation of the info Dave reports. (Of course I'm the one who sent them the info, so it's only a little self-s

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Ralph Castain
On Mar 21, 2011, at 11:53 AM, Randall Svancara wrote: > I am not sure if there is any extra configuration necessary for torque > to forward the environment. I have included the output of printenv > for an interactive qsub session. I am really at a loss here because I > never had this much diffi

Re: [OMPI users] Displaying MAIN in Totalview

2011-03-21 Thread Jeff Squyres
Welcome back, Peter. :-) On Mar 21, 2011, at 2:02 PM, Peter Thompson wrote: > Gee, I had tried posting that info earlier today, but my post was rejected > because my email address has changed. This is as much a test of that address > change request as it is a confirmation of the info Dave re

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Ralph Castain
On Mar 21, 2011, at 11:59 AM, Jeff Squyres wrote: > I no longer run Torque on my cluster, so my Torqueology is pretty rusty -- > but I think there's a Torque command to launch on remote nodes. tmrsh or > pbsrsh or something like that...? pbsrsh, IIRC. So run pbsrsh printenv to see the enviro

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
I added that temp directory in, but it does not seem to make a difference either way. It was just to illustrate that I was trying to specify the temp directory in another place. I was under the impression that running mpiexec in a torque/qsub interactive session would be similar to running torque wi

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
Ok, Let me give this a try. Thanks for all your helpful suggestions. On Mon, Mar 21, 2011 at 11:10 AM, Ralph Castain wrote: > > On Mar 21, 2011, at 11:59 AM, Jeff Squyres wrote: > >> I no longer run Torque on my cluster, so my Torqueology is pretty rusty -- >> but I think there's a Torque comma

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Brock Palen
On Mar 21, 2011, at 1:59 PM, Jeff Squyres wrote: > I no longer run Torque on my cluster, so my Torqueology is pretty rusty -- > but I think there's a Torque command to launch on remote nodes. tmrsh or > pbsrsh or something like that...? pbsdsh. If TM is working pbsdsh should work fine. Torque+

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
Ok, these are good things to check. I am going to follow through with this in the next hour after our GPFS upgrade. Thanks!!! On Mon, Mar 21, 2011 at 11:14 AM, Brock Palen wrote: > On Mar 21, 2011, at 1:59 PM, Jeff Squyres wrote: > >> I no longer run Torque on my cluster, so my Torqueology is p

Re: [OMPI users] Displaying MAIN in Totalview

2011-03-21 Thread Dominik Goeddeke
Hi, for what it's worth: Same thing happens with DDT. OpenMPI 1.2.x runs fine, later versions (at least 1.4.x and newer) let DDT bail out with "Could not break at function MPIR_Breakpoint". DDT has something like "OpenMPI (compatibility mode)" in its session launch dialog, with this setting

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
Hi. The pbsdsh tool is great. I ran an interactive qsub session (qsub -I -lnodes=2:ppn=12) and then ran the pbsdsh tool like this: [rsvancara@node164 ~]$ /usr/local/bin/pbsdsh -h node164 printenv PATH=/bin:/usr/bin LANG=C PBS_O_HOME=/home/admins/rsvancara PBS_O_LANG=en_US.UTF-8 PBS_O_LOGNAME=r

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Ralph Castain
mpiexec doesn't use pbsdsh (we use a TM API), but the effect is the same. Been so long since I ran on a Torque machine, though, that I honestly don't remember how to set the LD_LIBRARY_PATH on the backend. Do you have a sys admin there whom you could ask? Or you could ping the Torque list about

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
Yeah, the system admin is me, lol... and this is a new system which I am frantically trying to work the bugs out of. Torque and MPI are my last hurdles to overcome. But I have already been through some faulty infiniband equipment, bad memory and bad drives... which is to be expected on a cluste

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
I upgraded from torque 2.4.7 to torque version 2.5.5 and everything works as expected. I am not sure if it is how the old RPMs were compiled or if it is a version problem. In any case, I learned a lot more about Torque and OpenMPI so it is not a total waste of time and effort. Thanks for everyon

[OMPI users] Is there an mca parameter equivalent to -bind-to-core?

2011-03-21 Thread Gustavo Correa
Dear OpenMPI Pros Is there an MCA parameter that would do the same as the mpiexec switch '-bind-to-core'? I.e., something that I could set up not on the mpiexec command line, but for the whole cluster, or for a user, etc. In the past I used '-mca mpi mpi_paffinity_alone=1'. But that was before

Re: [OMPI users] Is there an mca parameter equivalent to -bind-to-core?

2011-03-21 Thread Eugene Loh
Gustavo Correa wrote: Dear OpenMPI Pros Is there an MCA parameter that would do the same as the mpiexec switch '-bind-to-core'? I.e., something that I could set up not on the mpiexec command line, but for the whole cluster, or for a user, etc. In the past I used '-mca mpi mpi_paffinity_alone
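For context, MCA parameters can be set outside the command line via Open MPI's system-wide or per-user parameter files. A sketch follows; the specific binding parameter name varies by release (orte_process_binding is a 1.4-era candidate, not confirmed in this thread), so verify the exact name for your version with `ompi_info --param all all` before relying on it:

```
# $prefix/etc/openmpi-mca-params.conf (system-wide, applies to the
# whole cluster install) or ~/.openmpi/mca-params.conf (per-user).
# One "name = value" entry per line; command-line -mca flags still
# override anything set here.
orte_process_binding = core
```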