Hi Ralph,
Not at the moment - though I imagine we could create one. It is a tad
tricky in that we allow multiple -x options on the cmd line, but we
obviously can't do that with an envar.
why not?
export OMPI_Magic_Variable="-x LD_LIBRARY_PATH -x PATH"
could be possible, or not?
I can ad
Hi Jeff,
FWIW, environment variables prefixed with "OMPI_" will automatically be
distributed out to processes.
Of course, but sadly the variable(s) we want to distribute aren't
"OMPI_" variables.
Depending on your environment and launcher, your entire environment may
be copied out
On Nov 10, 2009, at 2:48 AM, Paul Kapinos wrote:
Hi Ralph,
Not at the moment - though I imagine we could create one. It is a
tad tricky in that we allow multiple -x options on the cmd line,
but we obviously can't do that with an envar.
why not?
export OMPI_Magic_Variable="-x LD_LIBRAR
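For reference, a minimal sketch of the -x mechanism being discussed (the executable name ./myapp and the process count are placeholders, not from the original mails):

mpirun -np 4 -x LD_LIBRARY_PATH -x PATH ./myapp

Each variable to forward gets its own -x flag, which is exactly why collapsing several of them into a single environment variable is awkward.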
Thanks for the reply Pallab. Firewall is not an issue as I can
passwordless-SSH to/from both machines.
My problem is dealing with 32-bit and 64-bit architectures simultaneously
(not with different operating systems). Is this possible with Open MPI?
Look forward to the solution!
Thanks,
Yog
Do you see any output from your executables? I.e., are you sure that
it's running the "correct" executables? If so, do you know how far
it's getting in its run before aborting?
On Nov 10, 2009, at 7:36 AM, Yogesh Aher wrote:
Thanks for the reply Pallab. Firewall is not an issue as I can
Dear all,
I am trying to compile openmpi-1.3.3 with the Intel Fortran compiler and gcc.
In order to compile openmpi I run configure with the following options:
./configure --prefix=/Libraries/openmpi FC=ifort --enable-mpi-f90
Open MPI compiled just fine, but when I try to compile and link my p
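For comparison, a full configure-and-build sequence for this kind of setup might look like the sketch below (the install prefix is taken from the post above; the gcc/g++ choices and the program name myprog.f90 are assumptions):

./configure --prefix=/Libraries/openmpi CC=gcc CXX=g++ F77=ifort FC=ifort --enable-mpi-f90
make all install
/Libraries/openmpi/bin/mpif90 -o myprog myprog.f90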
Hello,
I am getting some strange results when I enable the MCA parameter
mpi_yield_when_idle.
What happens is that for MPI programs which do lots of synchronization
(MPI_Barrier and MPI_Wait), I get a very good speedup (2x) by turning on the
parameter (e.g. the CG benchmark of the NAS parallel
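For anyone wanting to try the same parameter, it can be set per run or through the environment; the benchmark name below is just a placeholder:

mpirun --mca mpi_yield_when_idle 1 -np 4 ./cg.B.4

or, equivalently, before launching:

export OMPI_MCA_mpi_yield_when_idle=1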
Hi there,
I'm experiencing some issues using GE6.2U4 and OpenMPI-1.3.3 (with the
gridengine component).
During any job submission, SGE creates a session directory in $TMPDIR,
named after the job id and the computing node name. This session
directory is created using nobody/nogroup credentials.
Creating a directory with such credentials sounds like a bug in SGE to
me...perhaps an SGE config issue?
Only thing you could do is tell OMPI to use some other directory as
the root for its session dir tree - check "mpirun -h", or ompi_info
for the required option.
But I would first check
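If I remember the parameter name correctly (treat it as an assumption and confirm with ompi_info), the session-directory root can be moved like this, with ./myapp standing in for the real job:

ompi_info --all | grep tmpdir
mpirun --mca orte_tmpdir_base /tmp -np 4 ./myapp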
Hello,
We've been having a lot of problems where openmpi jobs crash at
startup because the call to lsb_launch fails (we have a ticket open
with Platform about this). Is there a way to disable the lsb_launch
startup mechanism at runtime and revert to ssh? It's easy enough to
recompile without LSF
Thanks for your help Ralph, I'll double check that.
As for the error message received, there might be some inconsistency:
"/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0" is the parent
directory and
"/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0/53199/0/0" is
the subdirectory
What version of OMPI?
On Nov 10, 2009, at 9:49 AM, Chris Walker wrote:
Hello,
We've been having a lot of problems where openmpi jobs crash at
startup because the call to lsb_launch fails (we have a ticket open
with Platform about this). Is there a way to disable the lsb_launch
startup mechani
Hi,
On Nov 10, 2009, at 5:55 PM, Eloi Gaudry wrote:
Thanks for your help Ralph, I'll double check that.
As for the error message received, there might be some
inconsistency: "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-
eg@charlie_0" is the
often /opt/sge is shared across the nodes, while the /
We have modules for both 1.3.2 and 1.3.3 (intel compilers)
Chris
On Tue, Nov 10, 2009 at 11:58 AM, Ralph Castain wrote:
> What version of OMPI?
>
> On Nov 10, 2009, at 9:49 AM, Chris Walker wrote:
>
>> Hello,
>>
>> We've been having a lot of problems where openmpi jobs crash at
>> startup becaus
Thanks for your help Reuti,
I'm using an NFS-shared directory (/opt/sge/tmp), exported from the
master node to all other computing nodes,
with /etc/exports on the server (named moe.fft): /opt/sge
192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
/etc/fstab on client:
Just add
plm = rsh
to your default mca param file.
You don't need to reconfigure or rebuild OMPI
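Spelled out, with the usual per-user parameter file location (the file path and the example run line are illustrative):

echo "plm = rsh" >> $HOME/.openmpi/mca-params.conf

or, for a single run:

mpirun --mca plm rsh -np 16 ./myapp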
On Nov 10, 2009, at 10:16 AM, Chris Walker wrote:
We have modules for both 1.3.2 and 1.3.3 (intel compilers)
Chris
On Tue, Nov 10, 2009 at 11:58 AM, Ralph Castain
wrote:
What version of OMP
Perfect! Thanks very much,
Chris
On Tue, Nov 10, 2009 at 12:22 PM, Ralph Castain wrote:
> Just add
>
> plm = rsh
>
> to your default mca param file.
>
> You don't need to reconfigure or rebuild OMPI
>
> On Nov 10, 2009, at 10:16 AM, Chris Walker wrote:
>
>> We have modules for both 1.3.2 and 1.3
On Tue, Nov 10, 2009 at 03:44:59PM +0200, vasilis gkanis wrote:
>
> I am trying to compile openmpi-1.3.3 with the Intel Fortran compiler and gcc.
>
> In order to compile openmpi I run configure with the following options:
>
> ./configure --prefix=/Libraries/openmpi FC=ifort --enable-mpi-f90
>
> Ope
On Nov 10, 2009, at 6:20 PM, Eloi Gaudry wrote:
Thanks for your help Reuti,
I'm using an NFS-shared directory (/opt/sge/tmp), exported from the
master node to all other computing nodes.
It's highly advisable to have the "tmpdir" local on each node. When
you use "cd $TMPDIR" in your jobscript,
Reuti,
I'm using "tmpdir" as a shared directory that contains the session
directories created during job submission, not for computing or local
storage. Doesn't the session directory (i.e. job_id.queue_name) need to
be shared among all computing nodes (at least the ones that would be
used wit
Hi,
On Nov 10, 2009, at 7:01 PM, Eloi Gaudry wrote:
Reuti,
I'm using "tmpdir" as a shared directory that contains the session
directories created during job submission, not for computing or
local storage. Doesn't the session directory (i.e.
job_id.queue_name) need to be shared among all comp
Reuti,
The ACLs here were just added when I tried to force the /opt/sge/tmp
subdirectories to be 777 (which I did when I first encountered the
subdirectory-creation error within OpenMPI). I don't think the info I'll
provide will be meaningful here:
moe:~# getfacl /opt/sge/tmp
getfacl: R
Hi all.
(sorry for the duplication, if it is one)
I have to parallelize a CFD code using domain/grid/mesh partitioning among
the processes. Before running, we do not know:
(i) How many processes we will use (np is unknown)
(ii) How many neighbouring processes a process will have (my_nbrs = ?)
(iii) How
I'm trying to build OpenMPI with Intel compilers, both static and dynamic libs,
then move it to a system that does not have Intel compilers. I don't care
about system libraries or OpenMPI loadable modules being dynamic, that's all
fine. But I need the compiler libs to be statically linked into
amjad ali wrote:
Hi all.
(sorry for the duplication, if it is one)
I have to parallelize a CFD code using domain/grid/mesh partitioning
among the processes. Before running, we do not know:
(i) How many processes we will use (np is unknown)
(ii) How many neighbouring processes a process will have (m
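As a minimal sketch of the runtime-discovery part of this question (the neighbour lists below are fabricated purely for illustration; a real CFD code would take them from its mesh partitioner), note that np never has to be known before the run:

/* nbrs.c - np and per-rank neighbour counts determined at run time */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int np, rank, my_nbrs = 0, nbr[2];
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &np);   /* np comes from "mpirun -np N", not from the source */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* illustrative neighbour list: left/right ranks in a 1-D decomposition */
    if (rank > 0)      nbr[my_nbrs++] = rank - 1;
    if (rank < np - 1) nbr[my_nbrs++] = rank + 1;
    printf("rank %d of %d has %d neighbour(s), first is %d\n",
           rank, np, my_nbrs, my_nbrs ? nbr[0] : -1);
    MPI_Finalize();
    return 0;
}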
I'm having a problem getting the mpirun "preload-binary" option to work.
I'm using Ubuntu 8.10 with openmpi 1.3.3, with nodes connected by Ethernet cable.
If I copy the executable to client nodes using scp, then do mpirun,
everything works.
But I really want to avoid the copying, so I tried the -prel
It -should- work, but you need password-less ssh setup. See our FAQ
for how to do that, if you are unfamiliar with it.
On Nov 10, 2009, at 2:02 PM, Qing Pang wrote:
I'm having a problem getting the mpirun "preload-binary" option to work.
I'm using Ubuntu 8.10 with openmpi 1.3.3, nodes connected
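For reference, the usual spelling of that option is something like the line below (the hostfile name and binary are placeholders):

mpirun -np 4 --hostfile myhosts --preload-binary ./myapp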
Hi,
I am using the MPI_Reduce operation on a 122880x400 matrix of doubles. The
parallel job runs on 32 machines, each having a different processor in
terms of speed, but the architecture and OS are the same on all
machines (x86_64). The task is a typical map-and-reduce, i.e. each of
the processes
I want to run a number of MPI executables simultaneously in a PBS job.
For example on my system I do 'cat $PBS_NODEFILE' and get a list like
this:
n04
n04
n04
n04
n06
n06
n06
n06
n07
n07
n07
n07
n09
n09
n09
n09
i.e., 16 processors on 4 nodes, which I can parse into file(s) as
desired. If I w
What version are you trying to do this with?
Reason I ask: in 1.3.x, we introduced relative node syntax for
specifying hosts to use. This would eliminate the need to create the
hostfiles.
You might do a "man orte_hosts" (assuming you installed the man pages)
and see what it says.
Ralph
Yeah, that is "normal". It has to do with unexpected messages.
When you have procs running at significantly different speeds, the
various operations get far enough out of sync that the memory consumed
by received messages not yet processed grows too large.
Instead of sticking barriers into you
I'm away from icc help resources, but try the -static-intel compiler
flag.
On Nov 10, 2009, at 2:51 PM, Blosch, Edwin L wrote:
I’m trying to build OpenMPI with Intel compilers, both static and
dynamic libs, then move it to a system that does not have Intel
compilers. I don’t care about s
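One way to push that flag through the whole build (a sketch; the compiler set and install prefix are assumptions) is to hand it to configure via LDFLAGS:

./configure CC=icc CXX=icpc F77=ifort FC=ifort LDFLAGS=-static-intel --prefix=/opt/openmpi-intel
make all install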
On Nov 10, 2009, at 11:26 PM, Jeff Squyres wrote:
I'm away from icc help resources, but try the -static-intel
compiler flag.
I also like the compiler-specific libs to be linked in statically - I
just rename the *.so files to *.so.disabled, so the linker is forced to
use the .a files of the Intel lib
Hi Reuti,
I followed your advice and switched to a local "tmpdir" instead of a
shared one. This solved the session directory issue, thanks for your help!
However, I cannot understand how the issue disappeared. Any input would
be welcome, as I would really like to understand how SGE/OpenMPI could faile
Hi Eloi,
On Nov 10, 2009, at 11:42 PM, Eloi Gaudry wrote:
I followed your advice and switched to a local "tmpdir" instead of
a shared one. This solved the session directory issue, thanks for
your help!
what user/group is now listed for the generated temporary directories
(i.e. $TMPDIR)?
-- R
On Nov 10, 2009, at 11:51 PM, Reuti wrote:
Hi Eloi,
On Nov 10, 2009, at 11:42 PM, Eloi Gaudry wrote:
I followed your advice and switched to a local "tmpdir" instead of
a shared one. This solved the session directory issue, thanks for
your help!
what user/group is now listed for the generated tempor
On Nov 11, 2009, at 12:03 AM, Eloi Gaudry wrote:
The user/group used to generate the temporary directories was
nobody/nogroup, when using a shared $tmpdir.
Now that I'm using a local $tmpdir (one for each node, not
distributed over nfs), the right credentials (i.e. my username/
groupname) are use
On any execution node, creating a subdirectory of /opt/sge/tmp (i.e.
creating a session directory inside $TMPDIR) results in a new directory
owned by the user/group that submitted the job (not nobody/nogroup).
If I switch back to a shared /opt/sge/tmp directory, all session
directories created by
To avoid misunderstandings:
On Nov 11, 2009, at 12:19 AM, Eloi Gaudry wrote:
On any execution node, creating a subdirectory of /opt/sge/tmp
(i.e. creating a session directory inside $TMPDIR) results in a new
directory owned by the user/group that submitted the job (not nobody/
nogroup).
$TMPDIR
This is what I did (I created /opt/sge/tmp/test by hand on an execution
host, logged in as a regular cluster user).
Eloi
On 11/11/2009 00:26, Reuti wrote:
To avoid misunderstandings:
On Nov 11, 2009, at 12:19 AM, Eloi Gaudry wrote:
On any execution node, creating a subdirectory of /opt/sge/tmp (i.e.
creat
On Nov 11, 2009, at 12:29 AM, Eloi Gaudry wrote:
This is what I did (I created /opt/sge/tmp/test by hand on an
execution host, logged in as a regular cluster user).
Then we end up back where my thinking first started, but I missed the
implied default: can you export /opt/sge with "no_root_squash" and
reload t
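Concretely, that would mean amending the export line quoted earlier on the server and re-exporting (verify against your own /etc/exports before applying anything):

/opt/sge 192.168.0.0/255.255.255.0(rw,sync,no_subtree_check,no_root_squash)
exportfs -ra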
Ralph,
I am using 1.3.2, so the relative node syntax certainly seems the way to
go. However, I seem to be missing something. On the 'orte_hosts' man
page near the top is the simple example:
mpirun -pernode -host +n1,+n2 ./app1 : -host +n3,+n4 ./app2
I set up my job to run on 4 nodes (4 proces
You can use the relative host syntax, but you cannot use a "pernode"
or "npernode" option when you have more than one application on the
cmd line. You have to specify the number of procs for each
application, as the error message says. :-)
IIRC, the reason was that we couldn't decide on how
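Adapting the man-page example to what the error message asks for, something along these lines should be accepted (the per-app process counts are arbitrary):

mpirun -np 2 -host +n1,+n2 ./app1 : -np 2 -host +n3,+n4 ./app2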
Hi,
I have a problem with sending/receiving large buffers when using
openmpi (version 1.3.3), e.g.,
MPI_Allreduce(sbuf, rbuf, count, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
with count=18000 (this problem does not appear to be unique to
Allreduce, but occurs with Reduce and Bcast as well; maybe m
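As a stop-gap while such a problem is being tracked down, the reduction can be broken into smaller pieces; below is a minimal sketch (the chunk size is arbitrary, and for MPI_SUM the piecewise result is identical to the single call):

/* chunked_allreduce.c - perform a large MPI_Allreduce in fixed-size pieces */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int count = 18000, chunk = 4096;     /* count from the report, chunk arbitrary */
    double *sbuf = malloc(count * sizeof(double));
    double *rbuf = malloc(count * sizeof(double));
    int i, off;
    MPI_Init(&argc, &argv);
    for (i = 0; i < count; i++) sbuf[i] = 1.0;  /* dummy data */
    for (off = 0; off < count; off += chunk) {
        int n = (count - off < chunk) ? count - off : chunk;
        MPI_Allreduce(sbuf + off, rbuf + off, n, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    }
    free(sbuf); free(rbuf);
    MPI_Finalize();
    return 0;
}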