[OMPI users] selected pml cm, but peer [[2469, 1], 0] on compute-0-0 selected pml ob1

2009-03-18 Thread Gary Draving
Hi all, has anyone ever seen an error like this? It seems I have some setting wrong in Open MPI. I thought I had it set up like the other machines, but it seems I have missed something. I only get the error when adding machine "fs1" to the hostfile list. The other 40+ machines seem fine.

Re: [OMPI users] [Fwd: Re: open mpi on non standard ssh port]

2009-03-18 Thread Jeff Squyres
It means you started the jobs ok (via ssh) but Open MPI wasn't able to open TCP sockets between the two MPI processes. Open MPI needs to be able to communicate via random TCP ports between its MPI processes. On Mar 18, 2009, at 8:39 AM, Bernhard Knapp wrote: Hey again, I tried to build a
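
A rough way to check the TCP reachability described in this reply is to try opening a socket from one node to another. The following is only a sketch under stated assumptions: the helper name can_connect, the host name, and the port number are made up for illustration, and it assumes bash and the coreutils timeout command are installed. Since Open MPI picks its ports dynamically, a single probe only shows whether that one port is reachable, not whether the whole range is open.

```shell
# can_connect HOST PORT: hypothetical helper, not an Open MPI tool.
# Returns 0 if a TCP connection to HOST:PORT opens within 3 seconds.
can_connect() {
    # bash's /dev/tcp pseudo-device opens a TCP socket on redirection;
    # timeout(1) ends the attempt if a firewall silently drops packets.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example (host name and port 5000 are placeholders):
# can_connect compute-0-0 5000 && echo reachable || echo blocked
```

A failure here on a port where a process is known to be listening would point at a firewall between the nodes rather than at the ssh launch.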

Re: [OMPI users] open mpi on non standard ssh port

2009-03-18 Thread Jeff Squyres
FWIW, two other people said the same thing already: http://www.open-mpi.org/community/lists/users/2009/03/8479.php http://www.open-mpi.org/community/lists/users/2009/03/8481.php :-) On Mar 18, 2009, at 4:51 AM, Reuti wrote: Bernhard, Am 18.03.2009 um 09:19 schrieb Bernhard Knapp: > come on

Re: [OMPI users] openmpi 1.3 and gridengine tight integration problem

2009-03-18 Thread Rene Salmon
> At this FAQ, we show an example of a parallel environment setup. > http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge > > I am wondering if the control_slaves needs to be TRUE. > And double check that the PE (pavtest) is on the list for the > queue > (also mentioned at the FAQ

[OMPI users] Fwd: New MPI-2.1 standard in hardcover - the yellow book

2009-03-18 Thread Jeff Squyres
I can't remember if I've forwarded this to the OMPI lists before; pardon if you have seen this before. I have one of these books and I find it quite handy. IMHO: it's quite a steal for US$25 (~600 pages). Begin forwarded message: From: "Rolf Rabenseifner" Date: March 18, 2009 10:21:31 AM

Re: [OMPI users] openmpi 1.3 and gridengine tight integration problem

2009-03-18 Thread Rolf Vandevaart
On 03/18/09 09:52, Reuti wrote: Hi, Am 18.03.2009 um 14:25 schrieb Rene Salmon: Thanks for the help. I only use the machine file to run outside of SGE just to test/prove that things work outside of SGE. aha. Did you compile Open MPI 1.3 with the SGE option? When I run within SGE here is

Re: [OMPI users] openmpi 1.3 and gridengine tight integration problem

2009-03-18 Thread Rene Salmon
> > aha. Did you compile Open MPI 1.3 with the SGE option? > Yes I did. hpcp7781(salmr0)142:ompi_info |grep grid MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3) > > > setenv LD_LIBRARY_PATH /bphpc7/vol0/salmr0/ompi/lib > > Maybe you have to set this LD_LIBRARY_PAT

Re: [OMPI users] openmpi 1.3 and gridengine tight integration problem

2009-03-18 Thread Reuti
Hi, Am 18.03.2009 um 14:25 schrieb Rene Salmon: Thanks for the help. I only use the machine file to run outside of SGE just to test/prove that things work outside of SGE. aha. Did you compile Open MPI 1.3 with the SGE option? When I run within SGE here is what the job script looks like

Re: [OMPI users] openmpi 1.3 and gridengine tight integration problem

2009-03-18 Thread Rene Salmon
Hi, Thanks for the help. I only use the machine file to run outside of SGE just to test/prove that things work outside of SGE. When I run within SGE here is what the job script looks like: hpcp7781(salmr0)128:cat simple-job.sh #!/bin/csh # #$ -S /bin/csh setenv LD_LIBRARY_PATH /bphpc7/vol0/sal

[OMPI users] [Fwd: Re: open mpi on non standard ssh port]

2009-03-18 Thread Bernhard Knapp
Hey again, I tried to build a work around via port redirection: iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 22 -j REDIRECT --to-port 5101 If I do that then I can start the job: mpirun -np 2 -machinefile /home/bknapp/scripts/machinefile.txt mdrun -np 2 -nice 0 -s 1fyt_PKYVKQNTLELAT

Re: [OMPI users] mpirun hangs when launching job on remote node

2009-03-18 Thread Raymond Wan
Hi Bogdan, Thanks for the information and looking forward to the new OpenMPI feature of port restriction... About Debian, I was wondering about that...I've had no problems with it and I was thinking everything was just done for me; of course, another possibility is that there was no firewall

Re: [OMPI users] mpirun hangs when launching job on remote node

2009-03-18 Thread Bogdan Costescu
On Wed, 18 Mar 2009, Raymond Wan wrote: Perhaps it has something to do with RH's defaults for the firewall settings? If your sysadmin uses kickstart to configure the systems, (s)he has to add 'firewall --disabled'; similar for SELinux which seems to have caused problems to another person on

Re: [OMPI users] open mpi on non standard ssh port

2009-03-18 Thread Reuti
Bernhard, Am 18.03.2009 um 09:19 schrieb Bernhard Knapp: come on, it must be somehow possible to use openmpi not on port 22!? ;-) It's not an issue of Open MPI but of ssh. You need a file ~/.ssh/config in your home directory with two lines: "host *" and "port 1234" (or whatever port you need). -- Reuti
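
The two-line ssh client config suggested in this reply could be created along these lines. This is a minimal sketch, not something from the thread: the function name write_ssh_port_config and its arguments are hypothetical, and 1234 is just the example port from the message. It appends to the file, so in an existing config any earlier matching Port setting would still take precedence, because ssh uses the first value it obtains for each option.

```shell
# write_ssh_port_config DIR PORT: hypothetical helper for illustration.
# Appends a "Host *" block with the given port to DIR/config.
write_ssh_port_config() {
    ssh_dir="$1"
    port="$2"
    # ssh expects the directory and config file not to be writable by others.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    printf 'Host *\n    Port %s\n' "$port" >> "$ssh_dir/config" || return 1
    chmod 600 "$ssh_dir/config"
}

# Example: apply it to the real per-user config directory.
# write_ssh_port_config "$HOME/.ssh" 1234
```

With this in place on the node that runs mpirun, the ssh connections it starts would use that port without any Open MPI specific option.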

Re: [OMPI users] open mpi on non standard ssh port

2009-03-18 Thread Bernhard Knapp
come on, it must be somehow possible to use openmpi not on port 22!? ;-) -- Message: 3 Date: Tue, 17 Mar 2009 09:45:29 +0100 From: Bernhard Knapp Subject: [OMPI users] open mpi on non standard ssh port To: us...@open-mpi.org Message-ID: <49bf6329.8090...@meduniwien

Re: [OMPI users] openmpi 1.3 and gridengine tight integration problem

2009-03-18 Thread Reuti
Hi, it shouldn't be necessary to supply a machinefile, as the one generated by SGE is taken automatically (i.e. the granted nodes are honored). You submitted the job requesting a PE? -- Reuti Am 18.03.2009 um 04:51 schrieb Salmon, Rene: Hi, I have looked through the list archives and

Re: [OMPI users] mpirun hangs when launching job on remote node

2009-03-18 Thread Raymond Wan
Hi Ron, Ron Babich wrote: Thanks for your response. I had noticed your thread, which is why I'm embarrassed (but happy) to say that it looks like my problem was the same as yours. I mentioned in my original email that there was no firewall running, which it turns out was a lie. I think th