Re: [OMPI users] an environment variable with the same meaning as the -x option of mpiexec

2009-11-10 Thread Paul Kapinos
Hi Ralph, Not at the moment - though I imagine we could create one. It is a tad tricky in that we allow multiple -x options on the cmd line, but we obviously can't do that with an envar. Why not? export OMPI_Magic_Variable="-x LD_LIBRARY_PATH -x PATH" could be possible, or not? I can ad

Re: [OMPI users] an environment variable with the same meaning as the -x option of mpiexec

2009-11-10 Thread Paul Kapinos
Hi Jeff, FWIW, environment variables prefixed with "OMPI_" will automatically be distributed out to processes. Of course, but sadly the variable(s) we want to distribute aren't "OMPI_" variables. Depending on your environment and launcher, your entire environment may be copied out
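
A tiny check program can make this concrete (an illustration only; the variable names passed on the command line are whatever you want to test). Build it with mpicc and launch it with, e.g., "mpirun -np 2 -x LD_LIBRARY_PATH ./envcheck LD_LIBRARY_PATH MY_VAR" to see which variables actually reach each rank:

  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* Print the value of each environment variable named on the command
     line, as seen by every MPI rank after launch. */
  int main(int argc, char **argv)
  {
      int rank, i;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      for (i = 1; i < argc; i++) {
          const char *val = getenv(argv[i]);
          printf("rank %d: %s=%s\n", rank, argv[i], val ? val : "(unset)");
      }
      MPI_Finalize();
      return 0;
  }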

Re: [OMPI users] an environment variable with the same meaning as the -x option of mpiexec

2009-11-10 Thread Ralph Castain
On Nov 10, 2009, at 2:48 AM, Paul Kapinos wrote: Hi Ralph, Not at the moment - though I imagine we could create one. It is a tad tricky in that we allow multiple -x options on the cmd line, but we obviously can't do that with an envar. Why not? export OMPI_Magic_Variable="-x LD_LIBRAR

Re: [OMPI users] Openmpi on Heterogeneous environment

2009-11-10 Thread Yogesh Aher
Thanks for the reply Pallab. The firewall is not an issue, as I can SSH password-less to/from both machines. My problem is dealing with 32-bit & 64-bit architectures simultaneously (not with different operating systems). Is this possible with Open MPI? Looking forward to the solution! Thanks, Yog

Re: [OMPI users] Openmpi on Heterogeneous environment

2009-11-10 Thread Jeff Squyres
Do you see any output from your executables? I.e., are you sure that it's running the "correct" executables? If so, do you know how far it's getting in its run before aborting? On Nov 10, 2009, at 7:36 AM, Yogesh Aher wrote: Thanks for the reply Pallab. Firewall is not an issue as I can

[OMPI users] ipo: warning #11009: file format not recognized for /Libraries_intel/openmpi/lib/libmpi.so

2009-11-10 Thread vasilis gkanis
Dear all, I am trying to compile openmpi-1.3.3 with the Intel Fortran compiler and gcc. In order to compile OpenMPI I ran configure with the following options: ./configure --prefix=/Libraries/openmpi FC=ifort --enable-mpi-f90 OpenMPI compiled just fine, but when I try to compile and link my p

[OMPI users] mpi_yield_when_idle effects

2009-11-10 Thread Simone Pellegrini
Hello, I am getting some strange results when I enable the MCA parameter mpi_yield_when_idle. What happens is that for MPI programs which do lots of synchronization (MPI_Barrier and MPI_Wait) I get very good speedup (2.x) by turning on the parameter (e.g. the CG benchmark of the NAS parallel

[OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
Hi there, I'm experiencing some issues using GE6.2U4 and OpenMPI-1.3.3 (with the gridengine component). During any job submission, SGE creates a session directory in $TMPDIR, named after the job id and the computing node name. This session directory is created using nobody/nogroup credentials.

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Ralph Castain
Creating a directory with such credentials sounds like a bug in SGE to me...perhaps an SGE config issue? Only thing you could do is tell OMPI to use some other directory as the root for its session dir tree - check "mpirun -h", or ompi_info for the required option. But I would first check

[OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Chris Walker
Hello, We've been having a lot of problems where openmpi jobs crash at startup because the call to lsb_launch fails (we have a ticket open with Platform about this). Is there a way to disable the lsb_launch startup mechanism at runtime and revert to ssh? It's easy enough to recompile without LSF

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
Thanks for your help Ralph, I'll double check that. As for the error message received, there might be some inconsistency: "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0" is the parent directory and "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0/53199/0/0" is the subdirectory

Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Ralph Castain
What version of OMPI? On Nov 10, 2009, at 9:49 AM, Chris Walker wrote: Hello, We've been having a lot of problems where openmpi jobs crash at startup because the call to lsb_launch fails (we have a ticket open with Platform about this). Is there a way to disable the lsb_launch startup mechani

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
Hi, On 10.11.2009 at 17:55 Eloi Gaudry wrote: Thanks for your help Ralph, I'll double check that. As for the error message received, there might be some inconsistency: "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0" is the often /opt/sge is shared across the nodes, while the /

Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Chris Walker
We have modules for both 1.3.2 and 1.3.3 (intel compilers) Chris On Tue, Nov 10, 2009 at 11:58 AM, Ralph Castain wrote: > What version of OMPI? > > On Nov 10, 2009, at 9:49 AM, Chris Walker wrote: > >> Hello, >> >> We've been having a lot of problems where openmpi jobs crash at >> startup becaus

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
Thanks for your help Reuti, I'm using an NFS-shared directory (/opt/sge/tmp), exported from the master node to all other computing nodes, with /etc/exports on the server (named moe.fft): /opt/sge 192.168.0.0/255.255.255.0(rw,sync,no_subtree_check) /etc/fstab on client:

Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Ralph Castain
Just add plm = rsh to your default mca param file. You don't need to reconfigure or rebuild OMPI On Nov 10, 2009, at 10:16 AM, Chris Walker wrote: We have modules for both 1.3.2 and 1.3.3 (intel compilers) Chris On Tue, Nov 10, 2009 at 11:58 AM, Ralph Castain wrote: What version of OMP

Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Chris Walker
Perfect! Thanks very much, Chris On Tue, Nov 10, 2009 at 12:22 PM, Ralph Castain wrote: > Just add > > plm = rsh > > to your default mca param file. > > You don't need to reconfigure or rebuild OMPI > > On Nov 10, 2009, at 10:16 AM, Chris Walker wrote: > >> We have modules for both 1.3.2 and 1.3

Re: [OMPI users] ipo: warning #11009: file format not recognized for /Libraries_intel/openmpi/lib/libmpi.so

2009-11-10 Thread Nifty Tom Mitchell
On Tue, Nov 10, 2009 at 03:44:59PM +0200, vasilis gkanis wrote: > > I am trying to compile openmpi-1.3.3 with intel Fortran and gcc compiler. > > In order to compile openmpi I run configure with the following options: > > ./configure --prefix=/Libraries/openmpi FC=ifort --enable-mpi-f90 > > Ope

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
On 10.11.2009 at 18:20 Eloi Gaudry wrote: Thanks for your help Reuti, I'm using an NFS-shared directory (/opt/sge/tmp), exported from the master node to all other computing nodes. It's highly advisable to have the "tmpdir" local on each node. When you use "cd $TMPDIR" in your jobscript,

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
Reuti, I'm using "tmpdir" as a shared directory that contains the session directories created during job submission, not for computing or local storage. Doesn't the session directory (i.e. job_id.queue_name) need to be shared among all computing nodes (at least the ones that would be used wit

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
Hi, On 10.11.2009 at 19:01 Eloi Gaudry wrote: Reuti, I'm using "tmpdir" as a shared directory that contains the session directories created during job submission, not for computing or local storage. Doesn't the session directory (i.e. job_id.queue_name) need to be shared among all comp

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
Reuti, The ACLs here were just added when I tried to force the /opt/sge/tmp subdirectories to be 777 (which I did when I first encountered the subdirectory-creation error within OpenMPI). I don't think the info I'll provide will be meaningful here: moe:~# getfacl /opt/sge/tmp getfacl: R

[OMPI users] Coding help requested

2009-11-10 Thread amjad ali
Hi all. (Sorry for duplication, if it is.) I have to parallelize a CFD code using domain/grid/mesh partitioning among the processes. Before running, we do not know: (i) how many processes we will use (np is unknown); (ii) how many neighbouring processes each process will have (my_nbrs = ?); (iii) How
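
A minimal sketch of the kind of exchange being described (not code from this thread; the neighbour list, counts, and buffers are hypothetical names assumed to come from the mesh-partitioning step): post non-blocking receives and sends for however many neighbours a rank ends up with, so neither np nor my_nbrs needs to be known in advance.

  #include <mpi.h>
  #include <stdlib.h>

  /* Exchange halo data with a run-time-determined set of neighbours.
     "my_nbrs" is the number of neighbours of this rank, "nbrs[i]" their
     ranks, "counts[i]" the number of doubles exchanged with neighbour i,
     and sendbuf[i]/recvbuf[i] the per-neighbour buffers (all assumed to
     be filled in by the partitioner). */
  void exchange_halos(int my_nbrs, const int *nbrs, const int *counts,
                      double **sendbuf, double **recvbuf, MPI_Comm comm)
  {
      MPI_Request *reqs = malloc(2 * my_nbrs * sizeof(MPI_Request));
      int i;

      for (i = 0; i < my_nbrs; i++)       /* post all receives first */
          MPI_Irecv(recvbuf[i], counts[i], MPI_DOUBLE, nbrs[i], 0, comm,
                    &reqs[i]);
      for (i = 0; i < my_nbrs; i++)       /* then the matching sends */
          MPI_Isend(sendbuf[i], counts[i], MPI_DOUBLE, nbrs[i], 0, comm,
                    &reqs[my_nbrs + i]);

      MPI_Waitall(2 * my_nbrs, reqs, MPI_STATUSES_IGNORE);
      free(reqs);
  }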

[OMPI users] How do you get static linkage for Intel compiler libs for the orterun executable?

2009-11-10 Thread Blosch, Edwin L
I'm trying to build OpenMPI with Intel compilers, both static and dynamic libs, then move it to a system that does not have Intel compilers. I don't care about system libraries or OpenMPI loadable modules being dynamic, that's all fine. But I need the compiler libs to be statically linked into

Re: [OMPI users] Coding help requested

2009-11-10 Thread Eugene Loh
amjad ali wrote: Hi all. (sorry for duplication, if it is) I have to parallelize a CFD code using domain/grid/mesh partitioning among the processes. Before running, we do not know, (i) How many processes we will use ( np is unknown) (ii) A process will have how many neighbouring processes (m

[OMPI users] Problem with mpirun -preload-binary option

2009-11-10 Thread Qing Pang
I'm having a problem getting the mpirun "preload-binary" option to work. I'm using Ubuntu 8.10 with OpenMPI 1.3.3, nodes connected with Ethernet cable. If I copy the executable to the client nodes using scp, then run mpirun, everything works. But I really want to avoid the copying, so I tried the -prel

Re: [OMPI users] Problem with mpirun -preload-binary option

2009-11-10 Thread Ralph Castain
It -should- work, but you need password-less ssh setup. See our FAQ for how to do that, if you are unfamiliar with it. On Nov 10, 2009, at 2:02 PM, Qing Pang wrote: I'm having a problem getting the mpirun "preload-binary" option to work. I'm using Ubuntu 8.10 with OpenMPI 1.3.3, nodes connected

[OMPI users] System hang-up on MPI_Reduce

2009-11-10 Thread Glembek Ondřej
Hi, I am using the MPI_Reduce operation on a 122880x400 matrix of doubles. The parallel job runs on 32 machines, each having a different processor in terms of speed, but the architecture and OS are the same on all machines (x86_64). The task is a typical map-and-reduce, i.e. each of the processes
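
One common way to keep such a reduction from building up a large backlog on the slower machines (a sketch only, not necessarily the remedy proposed in this thread) is to reduce the matrix in row slices rather than in a single call, which bounds the per-call buffer size and keeps the ranks loosely in step:

  #include <mpi.h>
  #include <stddef.h>

  #define ROWS  122880
  #define COLS  400
  #define SLICE 4096    /* rows per reduction step; tunable */

  /* Reduce a ROWS x COLS row-major matrix of doubles to rank 0 in slices.
     "local" and "global" are assumed to be allocated on every rank; only
     the root's copy of "global" is meaningful afterwards. */
  void reduce_matrix(const double *local, double *global, MPI_Comm comm)
  {
      int r;
      for (r = 0; r < ROWS; r += SLICE) {
          int rows = (r + SLICE <= ROWS) ? SLICE : ROWS - r;
          MPI_Reduce(local + (size_t)r * COLS, global + (size_t)r * COLS,
                     rows * COLS, MPI_DOUBLE, MPI_SUM, 0, comm);
      }
  }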

[OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Tom Rosmond
I want to run a number of MPI executables simultaneously in a PBS job. For example, on my system I do 'cat $PBS_NODEFILE' and get a list like this: n04 n04 n04 n04 n06 n06 n06 n06 n07 n07 n07 n07 n09 n09 n09 n09 i.e., 16 processors on 4 nodes, which I can parse into file(s) as desired. If I w

Re: [OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Ralph Castain
What version are you trying to do this with? Reason I ask: in 1.3.x, we introduced relative node syntax for specifying hosts to use. This would eliminate the need to create the hostfiles. You might do a "man orte_hosts" (assuming you installed the man pages) and see what it says. Ralph

Re: [OMPI users] System hang-up on MPI_Reduce

2009-11-10 Thread Ralph Castain
Yeah, that is "normal". It has to do with unexpected messages. When you have procs running at significantly different speeds, the various operations get far enough out of sync that the memory consumed by recvd messages not yet processed grows too large. Instead of sticking barriers into you

Re: [OMPI users] How do you get static linkage for Intel compiler libs for the orterun executable?

2009-11-10 Thread Jeff Squyres
I'm away from icc help resources, but try the -static-intel compiler flag. On Nov 10, 2009, at 2:51 PM, Blosch, Edwin L wrote: I’m trying to build OpenMPI with Intel compilers, both static and dynamic libs, then move it to a system that does not have Intel compilers. I don’t care about s

Re: [OMPI users] How do you get static linkage for Intel compiler libs for the orterun executable?

2009-11-10 Thread Reuti
On 10.11.2009 at 23:26 Jeff Squyres wrote: I'm away from icc help resources, but try the -static-intel compiler flag. I also like the compiler-specific libs to be linked in statically - I just rename the *.so to *.so.disabled. So the linker is forced to use the .a files of the Intel lib

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
Hi Reuti, I followed your advice and switched to a local "tmpdir" instead of a shared one. This solved the session directory issue, thanks for your help! However, I cannot understand how the issue disappeared. Any input would be welcome as I would really like to understand how SGE/OpenMPI could faile

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
Hi Eloi, On 10.11.2009 at 23:42 Eloi Gaudry wrote: I followed your advice and switched to a local "tmpdir" instead of a shared one. This solved the session directory issue, thanks for your help! What user/group is now listed for the generated temporary directories (i.e. $TMPDIR)? -- R

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
On 10.11.2009 at 23:51 Reuti wrote: Hi Eloi, On 10.11.2009 at 23:42 Eloi Gaudry wrote: I followed your advice and switched to a local "tmpdir" instead of a shared one. This solved the session directory issue, thanks for your help! What user/group is now listed for the generated tempor

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
On 11.11.2009 at 00:03 Eloi Gaudry wrote: The user/group used to generate the temporary directories was nobody/nogroup when using a shared $tmpdir. Now that I'm using a local $tmpdir (one for each node, not distributed over NFS), the right credentials (i.e. my username/groupname) are use

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
On any execution node, creating a subdirectory of /opt/sge/tmp (i.e. creating a session directory inside $TMPDIR) results in a new directory owned by the user/group that submitted the job (not nobody/nogroup). If I switch back to a shared /opt/sge/tmp directory, all session directories created by

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
To avoid misunderstandings: On 11.11.2009 at 00:19 Eloi Gaudry wrote: On any execution node, creating a subdirectory of /opt/sge/tmp (i.e. creating a session directory inside $TMPDIR) results in a new directory owned by the user/group that submitted the job (not nobody/nogroup). $TMPDIR

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
This is what I did (created /opt/sge/tmp/test by hand on an execution host, logged in as a regular cluster user). Eloi On 11/11/2009 00:26, Reuti wrote: To avoid misunderstandings: On 11.11.2009 at 00:19 Eloi Gaudry wrote: On any execution node, creating a subdirectory of /opt/sge/tmp (i.e. creat

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
On 11.11.2009 at 00:29 Eloi Gaudry wrote: This is what I did (created /opt/sge/tmp/test by hand on an execution host, logged in as a regular cluster user). Then we end up back at my first thought, but I missed the implied default: can you export /opt/sge with "no_root_squash" and reload t

Re: [OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Tom Rosmond
Ralph, I am using 1.3.2, so the relative node syntax certainly seems the way to go. However, I seem to be missing something. On the 'orte_hosts' man page near the top is the simple example: mpirun -pernode -host +n1,+n2 ./app1 : -host +n3,+n4 ./app2 I set up my job to run on 4 nodes (4 proces

Re: [OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Ralph Castain
You can use the relative host syntax, but you cannot use a "pernode" or "npernode" option when you have more than one application on the cmd line. You have to specify the number of procs for each application, as the error message says. :-) IIRC, the reason was that we couldn't decide on how

[OMPI users] maximum value for count argument

2009-11-10 Thread Martin Siegert
Hi, I have a problem with sending/receiving large buffers when using OpenMPI (version 1.3.3), e.g., MPI_Allreduce(sbuf, rbuf, count, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD); with count=18000 (this problem does not appear to be unique to Allreduce, but occurs with Reduce, Bcast as well; maybe m
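
If the trouble really is the size of a single call (an assumption; the message is cut off here), one hedge is to loop the reduction over fixed-size chunks so each call's count stays well below INT_MAX and below any transport limits. Since the reduction is element-wise, chunking does not change the result:

  #include <mpi.h>
  #include <stddef.h>

  #define CHUNK (1 << 24)   /* 16M doubles (128 MB) per call; tunable */

  /* Element-wise sum of "total" doubles across the communicator,
     performed as a sequence of smaller MPI_Allreduce calls. */
  void allreduce_large(const double *sbuf, double *rbuf, size_t total,
                       MPI_Comm comm)
  {
      size_t done = 0;
      while (done < total) {
          int count = (total - done > (size_t)CHUNK)
                          ? CHUNK : (int)(total - done);
          MPI_Allreduce(sbuf + done, rbuf + done, count,
                        MPI_DOUBLE, MPI_SUM, comm);
          done += (size_t)count;
      }
  }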