On 02.02.2009 at 05:44, Sangamesh B wrote:

On Sun, Feb 1, 2009 at 10:37 PM, Reuti <re...@staff.uni-marburg.de> wrote:
On 01.02.2009 at 16:00, Sangamesh B wrote:

On Sat, Jan 31, 2009 at 6:27 PM, Reuti <re...@staff.uni-marburg.de> wrote:

On 31.01.2009 at 08:49, Sangamesh B wrote:

On Fri, Jan 30, 2009 at 10:20 PM, Reuti <re...@staff.uni-marburg.de> wrote:

On 30.01.2009 at 15:02, Sangamesh B wrote:

Dear Open MPI,

Do you have a solution for the following problem of Open MPI (1.3)
when run through Grid Engine.

I changed global execd params with H_MEMORYLOCKED=infinity and
restarted the sgeexecd in all nodes.

But still the problem persists:

$ cat err.77.CPMD-OMPI
ssh_exchange_identification: Connection closed by remote host

I think this might already be the reason why it's not working. Is an
mpihello program running fine through SGE?

No.

Any Open MPI parallel job through SGE runs only if it's running on a
single node (i.e. 8 processes on 8 cores of a single node). If the number
of processes is more than 8, then SGE will schedule it on two nodes, and
the job will fail with the above error.

Now I did a loose integration of Open MPI 1.3 with SGE. The job runs,
but all 16 processes run on a single node.

What are the entries in `qconf -sconf` for:

rsh_command
rsh_daemon

$ qconf -sconf
global:
execd_spool_dir              /opt/gridengine/default/spool
...
.....
qrsh_command                 /usr/bin/ssh
rsh_command                  /usr/bin/ssh
rlogin_command               /usr/bin/ssh
rsh_daemon                   /usr/sbin/sshd
qrsh_daemon                  /usr/sbin/sshd
reprioritize                 0

Do you have to use ssh? In a private cluster the rsh-based setup is often fine, or with SGE 6.2 the built-in mechanism of SGE. Otherwise please follow this:

http://gridengine.sunsource.net/howto/qrsh_qlogin_ssh.html
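With SGE 6.2 or later the ssh entries could be replaced by the built-in
mechanism; a sketch only (the cluster in this thread reports SGE 6.0, where
the "builtin" keyword is not available), edited via qconf -mconf:

qlogin_command               builtin
qlogin_daemon                builtin
rlogin_command               builtin
rlogin_daemon                builtin
rsh_command                  builtin
rsh_daemon                   builtin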


I think it's better to check once with Open MPI 1.2.8.

What is your mpirun command in the jobscript - are you getting the mpirun
from Open MPI there? According to the output below, it's not a loose
integration, but you already prepare a machinefile, which is superfluous
for Open MPI.

No. I've not prepared the machinefile for Open MPI.
For the tight integration job:

/opt/mpi/openmpi/1.3/intel/bin/mpirun -np $NSLOTS
$CPMDBIN/cpmd311-ompi-mkl.x  wf1.in $PP_LIBRARY >
wf1.out_OMPI$NSLOTS.$JOB_ID

For the loose integration job:

/opt/mpi/openmpi/1.3/intel/bin/mpirun -np $NSLOTS -hostfile
$TMPDIR/machines  $CPMDBIN/cpmd311-ompi-mkl.x  wf1.in $PP_LIBRARY >
wf1.out_OMPI_$JOB_ID.$NSLOTS

a) you compiled Open MPI with "--with-sge"?

Yes. But ompi_info shows only one SGE component:

$ /opt/mpi/openmpi/1.3/intel/bin/ompi_info | grep gridengine
MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3)
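For reference, enabling SGE support boils down to passing --with-sge to
configure; a minimal build sketch (the prefix and the Intel compiler
variables are assumptions taken from the paths used in this thread):

$ ./configure --prefix=/opt/mpi/openmpi/1.3/intel --with-sge \
      CC=icc CXX=icpc F77=ifort FC=ifort
$ make all install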

b) when the $SGE_ROOT variable is set, Open MPI will use a Tight Integration
automatically.

In the SGE job submit script, I set SGE_ROOT=<nothing>

This will set the variable to an empty string. You need to use:

unset SGE_ROOT
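To illustrate the difference in a plain shell (mimicking what the
jobscript environment would see; the path is just an example):

$ export SGE_ROOT=/opt/gridengine   # as SGE would set it for a job
$ SGE_ROOT=                         # empty, but still in the environment
$ env | grep '^SGE_ROOT'
SGE_ROOT=
$ unset SGE_ROOT                    # now it is really gone
$ env | grep '^SGE_ROOT'
$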

Despite the mentioned error message on the list, I can run Open MPI 1.3 with tight integration into SGE.

-- Reuti


And ran a loose integration job. It failed to run with the following error:
$ cat err.87.Hello-OMPI
[node-0-18.local:08252] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found
in file ess_hnp_module.c at line 126
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_plm_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[node-0-18.local:08252] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found
in file runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[node-0-18.local:08252] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found
in file orterun.c at line 454

$ cat out.87.Hello-OMPI
/opt/gridengine/default/spool/node-0-18/active_jobs/87.1/pe_hostfile
ibc18
ibc18
ibc18
ibc18
ibc18
ibc18
ibc18
ibc18
ibc17
ibc17
ibc17
ibc17
ibc17
ibc17
ibc17
ibc17


c) The machine file you presented looks like one for MPICH(1); the syntax
for Open MPI in the machinefile is different:

ibc17 slots=8
ibc12 slots=8
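Such a file could also be generated from SGE's own allocation; a one-line
sketch, assuming the usual pe_hostfile format (host, slot count, queue,
processor range) and a hypothetical output name:

$ awk '{print $1" slots="$2}' $PE_HOSTFILE > $TMPDIR/ompi_machines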

I tested a helloworld program with Open MPI with a machinefile in MPICH(1) style.
It works.

So in a loose integration job, Open MPI may not be able to find the
$TMPDIR/machines file, or it might be running in a tight integration style.
So you would have to adjust the format of the generated file and unset
SGE_ROOT inside your jobscript, to force Open MPI to do a loose integration
only.
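A minimal jobscript sketch of such a loose integration, assembled from the
pieces in this thread (PE name, slot count and the output file name are
assumptions, not a definitive recipe):

#!/bin/sh
#$ -S /bin/sh
#$ -pe orte 16
#$ -cwd -j y

# convert SGE's pe_hostfile into Open MPI's machinefile syntax
awk '{print $1" slots="$2}' $PE_HOSTFILE > $TMPDIR/ompi_machines

# hide SGE from Open MPI so it does not switch to a tight integration
unset SGE_ROOT

/opt/mpi/openmpi/1.3/intel/bin/mpirun -np $NSLOTS \
    -hostfile $TMPDIR/ompi_machines $CPMDBIN/cpmd311-ompi-mkl.x wf1.in \
    $PP_LIBRARY > wf1.out_OMPI_$JOB_ID.$NSLOTS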

-- Reuti


I think I should check with Open MPI 1.2.8. That may work.

Thanks,
Sangamesh

$ cat out.83.Hello-OMPI
/opt/gridengine/default/spool/node-0-17/active_jobs/83.1/pe_hostfile
ibc17
ibc17
ibc17
ibc17
ibc17
ibc17
ibc17
ibc17
ibc12
ibc12
ibc12
ibc12
ibc12
ibc12
ibc12
ibc12
Greetings: 1 of 16 from the node node-0-17.local
Greetings: 10 of 16 from the node node-0-17.local
Greetings: 15 of 16 from the node node-0-17.local
Greetings: 9 of 16 from the node node-0-17.local
Greetings: 14 of 16 from the node node-0-17.local
Greetings: 8 of 16 from the node node-0-17.local
Greetings: 11 of 16 from the node node-0-17.local
Greetings: 12 of 16 from the node node-0-17.local
Greetings: 6 of 16 from the node node-0-17.local
Greetings: 0 of 16 from the node node-0-17.local
Greetings: 5 of 16 from the node node-0-17.local
Greetings: 3 of 16 from the node node-0-17.local
Greetings: 13 of 16 from the node node-0-17.local
Greetings: 4 of 16 from the node node-0-17.local
Greetings: 7 of 16 from the node node-0-17.local
Greetings: 2 of 16 from the node node-0-17.local

But qhost -u <user name> shows that it is scheduled/running on two
nodes.

Is anybody successful in running Open MPI 1.3 tightly integrated with SGE?

For a Tight Integration there's a FAQ:

http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
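The core of that FAQ recipe is a parallel environment with control_slaves
enabled; roughly along these lines (values as commonly shown in the FAQ,
to be adapted locally):

$ qconf -sp orte
pe_name            orte
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min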

-- Reuti


Thanks,
Sangamesh

-- Reuti




--------------------------------------------------------------------------
A daemon (pid 31947) died unexpectedly with status 129 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.


--------------------------------------------------------------------------


--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.


--------------------------------------------------------------------------
ssh_exchange_identification: Connection closed by remote host


--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons on the nodes shown
below. Additional manual cleanup may be required - please refer to
the "orte-clean" tool for assistance.


--------------------------------------------------------------------------
    node-0-19.local - daemon did not report back when launched
    node-0-20.local - daemon did not report back when launched
    node-0-21.local - daemon did not report back when launched
    node-0-22.local - daemon did not report back when launched

The hostnames for the InfiniBand interfaces are ibc0, ibc1, ibc2 .. ibc23.
Maybe Open MPI is not able to identify the hosts, as it is using the
node-0-.. names. Is this causing Open MPI to fail?

Thanks,
Sangamesh


On Mon, Jan 26, 2009 at 5:09 PM, mihlon <vacl...@fel.cvut.cz> wrote:

Hi,

Hello SGE users,

The cluster is installed with Rocks-4.3, SGE 6.0 & Open MPI 1.3.
Open MPI is configured with "--with-sge".
ompi_info shows only one component:
# /opt/mpi/openmpi/1.3/intel/bin/ompi_info | grep gridengine
MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3)

Is this acceptable?

maybe yes

see: http://www.open-mpi.org/faq/?category=building#build-rte-sge

shell$ ompi_info | grep gridengine
MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.3)
MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.3)

(Specific frameworks and version numbers may vary, depending on your
version of Open MPI.)

The Open MPI parallel jobs run successfully from the command line, but
fail when run through SGE (with -pe orte <slots>).

The error is:

$ cat err.26.Helloworld-PRL
ssh_exchange_identification: Connection closed by remote host



--------------------------------------------------------------------------
A daemon (pid 8462) died unexpectedly with status 129 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.



--------------------------------------------------------------------------



--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.



--------------------------------------------------------------------------
mpirun: clean termination accomplished

But the same job runs well if it runs on a single node, although with an
error:

$ cat err.23.Helloworld-PRL
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.



--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

Local host: node-0-4.local
Local device: mthca0



--------------------------------------------------------------------------
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
[node-0-4.local:07869] 7 more processes have sent help message
help-mpi-btl-openib.txt / error in device init
[node-0-4.local:07869] Set MCA parameter "orte_base_help_aggregate" to 0
to see all help / error messages

The following link explains the same problem:



http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=72398

Following this reference, I put 'ulimit -l unlimited' into
/etc/init.d/sgeexecd on all nodes and restarted the services.

Do not set 'ulimit -l unlimited' in /etc/init.d/sgeexecd,
but set it in SGE:

Run qconf -mconf and set execd_params:


frontend$> qconf -sconf
...
execd_params                 H_MEMORYLOCKED=infinity
...


Then restart sgeexecd on all your hosts.
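To check whether an execution host picked up the new limit, a small test
job could be submitted (the host name is just an example from this thread);
it should then report "unlimited":

$ echo 'ulimit -l' | qsub -cwd -j y -o memlock.out -l h=node-0-4
$ cat memlock.out
unlimited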


Milan

But still the problem persists.

What could be the way out of this?

Thanks,
Sangamesh
