[OMPI users] mpirun not working on more than one node

2009-11-17 Thread Laurin Müller
Hi, I want to build a cluster with openmpi. 2 nodes: node 1: 4 x AMD Quad Core, ubuntu 9.04, openmpi 1.3.2; node 2: Sony PS3, ubuntu 9.04, openmpi 1.3. Both can connect via ssh to each other and to themselves without a password. I can run the sample program pi.c on both nodes separately (see below).

[OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread Michael Sternberg
Dear readers, With OpenMPI, how would one go about requesting to load environment modules (of the http://modules.sourceforge.net/ kind) on remote nodes, augmenting those normally loaded there by shell dotfiles? Background: I run a RHEL-5/CentOS-5 cluster. I load a bunch of default modules t

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread David Singleton
Hi Michael, I'm not sure why you don't see Open MPI behaving like other MPIs w.r.t. modules/environment on remote MPI tasks - we do. xe:~ > qsub -q express -lnodes=2:ppn=8,walltime=10:00,vmem=2gb -I qsub: waiting for job 376366.xepbs to start qsub: job 376366.xepbs ready [dbs900@x27 ~]$ module

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread Michael Sternberg
Hi David, Hmm, your demo is well-chosen and crystal-clear, yet the output is unexpected. I do not see environment vars passed by default here: login3$ qsub -l nodes=2:ppn=1 -I qsub: waiting for job 34683.mds01 to start qsub: job 34683.mds01 ready n102$ mpirun -n 2 -machinefile $PBS_NODEFILE h

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread David Singleton
I can see the difference - we built Open MPI with tm support. For some reason, I thought mpirun fed its environment to orted (after orted is launched) so orted can pass it on to MPI tasks. That should be portable between different launch mechanisms. But it looks like tm launches orted with the

Re: [OMPI users] mpirun not working on more than one node

2009-11-17 Thread Ralph Castain
Your cmd line is telling OMPI to run 17 processes. Since your hostfile indicates that only 16 of them are to run on 10.4.23.107 (which I assume is your PS3 node?), 1 process is going to be run on 10.4.1.23 (I assume this is node1?). I would guess that the executable is compiled to run on the PS
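The placement described in this reply can be illustrated with a minimal sketch. It assumes a simplified by-slot model in which hosts are filled in hostfile order; the function name `place_by_slot` and the slot count for 10.4.1.23 are hypothetical, and this is not Open MPI's actual mapper (it raises an error instead of modelling oversubscription).

```python
def place_by_slot(hosts, nprocs):
    """Assign processes to hosts in hostfile order, filling each host's slots first."""
    placement = []
    remaining = nprocs
    for name, slots in hosts:
        take = min(slots, remaining)      # never exceed this host's slot count
        placement.extend([name] * take)
        remaining -= take
        if remaining == 0:
            break
    if remaining:
        # Simplification: a real launcher may oversubscribe instead of failing.
        raise ValueError(f"{remaining} processes left without a slot")
    return placement


# Hypothetical hostfile: 16 slots on the first host (from the reply);
# the slot count of the second host is an assumption for illustration.
hosts = [("10.4.23.107", 16), ("10.4.1.23", 16)]
ranks = place_by_slot(hosts, 17)
print(ranks.count("10.4.23.107"), ranks.count("10.4.1.23"))  # prints: 16 1
```

With 17 processes and 16 slots on the first host, one process spills over to the second host, which is the situation the reply describes.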

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread Ralph Castain
Not exactly. It completely depends on how Torque was set up - OMPI isn't forwarding the environment. Torque is. We made a design decision at the very beginning of the OMPI project not to forward non-OMPI envars unless directed to do so by the user. I'm afraid I disagree with Michael's claim that

Re: [OMPI users] mpirun not working on more than one node

2009-11-17 Thread Laurin Müller
>>> Ralph Castain 11/17/09 4:04 PM >>> >Your cmd line is telling OMPI to run 17 processes. Since your hostfile indicates that only 16 of them are to >run on 10.4.23.107 (which I assume is your PS3 node?), 1 process is going to be run on 10.4.1.23 (I assume >this is node1?). node1 has 16 Cores (4 x

Re: [OMPI users] mpirun not working on more than one node

2009-11-17 Thread Lenny Verkhovsky
I noticed that you also have different versions of OMPI. You have 1.3.2 on node1 and 1.3 on node2. Can you try installing the same version of OMPI on both nodes? Can you also try running np 16 on node1 when you run separately? Lenny. On Tue, Nov 17, 2009 at 5:45 PM, Laurin Müller wrote: > > > >>

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread Michael Sternberg
Hi, On Nov 17, 2009, at 9:10 , Ralph Castain wrote: > Not exactly. It completely depends on how Torque was setup - OMPI isn't > forwarding the environment. Torque is. I actually tried compiling OMPI with the tm interface a couple of versions back for both packages but ran into memory trouble, w

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread Michael Sternberg
On Nov 17, 2009, at 10:17 , Michael Sternberg wrote: On Nov 17, 2009, at 9:10 , Ralph Castain wrote: Not exactly. It completely depends on how Torque was setup - OMPI isn't forwarding the environment. Torque is. I actually tried compiling OMPI with the tm interface a couple of versions back

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread David Singleton
Hi Ralph, Now I'm in a quandary - if I show you that it's actually Open MPI that is propagating the environment then you are likely to "fix it" and then tm users will lose a nice feature. :-) Can I suggest that "least surprise" would require that MPI tasks get exactly the same environment/limits

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread Ralph Castain
Ah - not good. It is clearly a programming error. I'll have to review the other launchers and consult the others in the project to decide on the proper course of action. Thanks On Nov 17, 2009, at 1:49 PM, David Singleton wrote: > > Hi Ralph, > > Now I'm in a quandary - if I show you that it's

[OMPI users] Problem on openmpi run

2009-11-17 Thread Jiaye Li
Dear users, I installed openmpi 1.3.3 on my PC (single core & quad-processes). The compilation reported no errors and I found the executable file in the configure directory. But when I tried to test it, I ran into a problem. I tested it with the Vasp and PWscf programs, respectively. I typed "mpirun -np

Re: [OMPI users] Problem on openmpi run

2009-11-17 Thread Ralph Castain
On Nov 17, 2009, at 7:39 PM, Jiaye Li wrote: > Dear users > > I installed openmpi 1.3.3 on my PC (single core & quad-processes). The > compilation reported no error and I have found the executable file in the > configure directory. But when I try to test it, I met a problem. > > I tested it

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread Ralph Castain
Sorry I didn't answer more completely before - a tad tied up today with network problems :-/ Actually, both you and Michael pointed out the "flaw" in your own reasoning, and hit the reason why we -don't- forward environment. It is obvious, for example, that you don't want to forward HOSTNAME an
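For readers who do want selected variables forwarded explicitly, one possible approach is to build an mpirun command line using the `-x` option, which exports a named environment variable to the remote processes. The sketch below is only an illustration: the helper `mpirun_command`, the `HOST_SPECIFIC` set, and the sample environment are hypothetical; only HOSTNAME as an example of a variable not to forward comes from the thread.

```python
import os

# Assumption: variables that should stay local to each host.
# Only HOSTNAME is named in the thread; extend the set as needed.
HOST_SPECIFIC = {"HOSTNAME"}


def mpirun_command(program, nprocs, forward, environ=None):
    """Build an mpirun argument list that forwards selected variables via -x."""
    environ = os.environ if environ is None else environ
    cmd = ["mpirun", "-np", str(nprocs)]
    for name in forward:
        # Skip host-specific variables and variables that are not set locally.
        if name in HOST_SPECIFIC or name not in environ:
            continue
        cmd += ["-x", name]
    cmd.append(program)
    return cmd


# Hypothetical environment and program name, for illustration only.
env = {"PATH": "/usr/bin", "LD_LIBRARY_PATH": "/opt/lib", "HOSTNAME": "n102"}
cmd = mpirun_command("./a.out", 2,
                     ["PATH", "LD_LIBRARY_PATH", "HOSTNAME", "MODULEPATH"], env)
print(" ".join(cmd))  # prints: mpirun -np 2 -x PATH -x LD_LIBRARY_PATH ./a.out
```

Keeping the selection explicit matches the design decision described in the thread: nothing outside Open MPI's own variables is forwarded unless the user asks for it.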