Hi all,
has anyone ever seen an error like this? It seems I have some setting
wrong in openmpi. I thought I had it set up like the other machines, but
it seems I have missed something. I only get the error when
adding machine "fs1" to the hostfile list. The other 40+ machines seem
fine.
It means you started the jobs ok (via ssh) but Open MPI wasn't able to
open TCP sockets between the two MPI processes. Open MPI needs to be
able to communicate via random TCP ports between its MPI processes.
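
A rough way to check this by hand is to try opening a TCP connection from one node to a port on the other. The sketch below is only an illustration and is not part of Open MPI; the helper name can_connect and its default timeout are made up here.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For this to say anything about a firewall, something has to be listening on the chosen port on the remote node. A False result on its own can also just mean that nothing is listening there.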
On Mar 18, 2009, at 8:39 AM, Bernhard Knapp wrote:
Hey again,
I tried to build a
FWIW, two other people said the same thing already:
http://www.open-mpi.org/community/lists/users/2009/03/8479.php
http://www.open-mpi.org/community/lists/users/2009/03/8481.php
:-)
On Mar 18, 2009, at 4:51 AM, Reuti wrote:
Bernhard,
On 18.03.2009 at 09:19, Bernhard Knapp wrote:
> come on
>
> At this FAQ, we show an example of a parallel environment setup.
> http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
>
> I am wondering if control_slaves needs to be TRUE.
> Also double check that the PE (pavtest) is on the list for the
> queue
> (also mentioned at the FAQ
I can't remember if I've forwarded this to the OMPI lists before;
pardon if you have seen this before. I have one of these books and I
find it quite handy. IMHO: it's quite a steal for US$25 (~600 pages).
Begin forwarded message:
From: "Rolf Rabenseifner"
Date: March 18, 2009 10:21:31 AM
On 03/18/09 09:52, Reuti wrote:
Hi,
On 18.03.2009 at 14:25, Rene Salmon wrote:
Thanks for the help. I only use the machine file to run outside of SGE
just to test/prove that things work outside of SGE.
aha. Did you compile Open MPI 1.3 with the SGE option?
When I run within SGE here is
>
> aha. Did you compile Open MPI 1.3 with the SGE option?
>
Yes I did.
hpcp7781(salmr0)142:ompi_info |grep grid
MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3)
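
The same check can be scripted. This is a minimal sketch that looks for a given framework/component pair in text shaped like the ompi_info line above; the function name has_mca_component is made up for illustration, and the sample string is simply that line.

```python
import re


def has_mca_component(ompi_info_text: str, framework: str, component: str) -> bool:
    """Return True if a line like 'MCA <framework>: <component> (...)' is present."""
    pattern = re.compile(
        r"^\s*MCA\s+" + re.escape(framework) + r":\s+" + re.escape(component) + r"\b",
        re.MULTILINE,
    )
    return pattern.search(ompi_info_text) is not None


# Sample taken from the output quoted in this message.
sample = "MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3)"
print(has_mca_component(sample, "ras", "gridengine"))
```

On a real system the text could come from subprocess.run(["ompi_info"], capture_output=True, text=True).stdout instead of the sample string.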
>
> > setenv LD_LIBRARY_PATH /bphpc7/vol0/salmr0/ompi/lib
>
> Maybe you have to set this LD_LIBRARY_PAT
Hi,
On 18.03.2009 at 14:25, Rene Salmon wrote:
Thanks for the help. I only use the machine file to run outside of SGE
just to test/prove that things work outside of SGE.
aha. Did you compile Open MPI 1.3 with the SGE option?
When I run within SGE here is what the job script looks like
Hi,
Thanks for the help. I only use the machine file to run outside of SGE
just to test/prove that things work outside of SGE.
When I run within SGE here is what the job script looks like:
hpcp7781(salmr0)128:cat simple-job.sh
#!/bin/csh
#
#$ -S /bin/csh
setenv LD_LIBRARY_PATH /bphpc7/vol0/sal
Hey again,
I tried to build a workaround via port redirection:
iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 22 -j REDIRECT --to-port 5101
If I do that then I can start the job:
mpirun -np 2 -machinefile /home/bknapp/scripts/machinefile.txt mdrun
-np 2 -nice 0 -s 1fyt_PKYVKQNTLELAT
Hi Bogdan,
Thanks for the information and looking forward to the new OpenMPI feature of
port restriction...
About Debian, I was wondering about that...I've had no problems with it and I
was thinking everything was just done for me; of course, another possibility is
that there was no firewall
On Wed, 18 Mar 2009, Raymond Wan wrote:
Perhaps it has something to do with RH's defaults for the firewall settings?
If your sysadmin uses kickstart to configure the systems, (s)he has to
add 'firewall --disabled'; similar for SELinux which seems to have
caused problems to another person on
Bernhard,
On 18.03.2009 at 09:19, Bernhard Knapp wrote:
come on, it must be somehow possible to use openmpi not on port
22!? ;-)
it's not an issue with Open MPI but with ssh. You need a file
~/.ssh/config in your home directory with two lines:
host *
port 1234
or whatever port you need.
-- Reuti
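
If only some machines listen on the non-standard port, the same two keywords can be limited to those hosts instead of "host *". A small sketch that renders such a stanza follows; the function name ssh_config_stanza and the host name node01 are hypothetical, and 1234 is just the placeholder port from the message above.

```python
def ssh_config_stanza(host_pattern: str, port: int) -> str:
    """Render a two-line ~/.ssh/config stanza for one host pattern."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # Indentation is only cosmetic; ssh accepts the keywords without it.
    return f"Host {host_pattern}\n    Port {port}\n"


# node01 is a made-up host name used only for illustration.
print(ssh_config_stanza("node01", 1234), end="")
```

The output would be appended to ~/.ssh/config by hand; ssh treats the keywords case-insensitively, so "host"/"port" and "Host"/"Port" are equivalent.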
come on, it must be somehow possible to use openmpi not on port 22!? ;-)
--
Message: 3
Date: Tue, 17 Mar 2009 09:45:29 +0100
From: Bernhard Knapp
Subject: [OMPI users] open mpi on non standard ssh port
To: us...@open-mpi.org
Message-ID: <49bf6329.8090...@meduniwien
Hi,
it shouldn't be necessary to supply a machinefile, as the one
generated by SGE is taken automatically (i.e. the granted nodes are
honored). Did you submit the job requesting a PE?
-- Reuti
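
To see which nodes were actually granted, one can look at the file SGE points to in $PE_HOSTFILE inside the job. The sketch below assumes the usual layout of one line per host with the columns host, slots, queue, and processor range; the function name parse_pe_hostfile and the host and queue names in the sample are made up.

```python
from typing import List, Tuple


def parse_pe_hostfile(text: str) -> List[Tuple[str, int]]:
    """Return (hostname, slots) pairs from PE hostfile text."""
    hosts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        hosts.append((fields[0], int(fields[1])))
    return hosts


# Hypothetical sample; real host and queue names will differ.
sample = "node01 4 all.q@node01 UNDEFINED\nnode02 2 all.q@node02 UNDEFINED\n"
granted = parse_pe_hostfile(sample)
print(granted, sum(slots for _, slots in granted))
```

Inside a job script the text would be read from the path in os.environ["PE_HOSTFILE"] rather than from the sample string.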
On 18.03.2009 at 04:51, Salmon, Rene wrote:
Hi,
I have looked through the list archives and
Hi Ron,
Ron Babich wrote:
Thanks for your response. I had noticed your thread, which is why I'm
embarrassed (but happy) to say that it looks like my problem was the
same as yours. I mentioned in my original email that there was no
firewall running, which it turns out was a lie. I think th