Re: [OMPI users] an environment variable with same meaning than the -x option of mpiexec

2009-11-10 Thread Paul Kapinos

Hi Ralph,



Not at the moment - though I imagine we could create one. It is a tad 
tricky in that we allow multiple -x options on the cmd line, but we 
obviously can't do that with an envar.


why not?

export OMPI_Magic_Variable="-x LD_LIBRARY_PATH -x PATH"
could be possible, or not?




I can add it to the "to-do" list for a rainy day :-)

That would be great :-)

Thanks for your help!

Paul Kapinos




with the -x option of mpiexec there is a way to distribute environment
variables:


-x   Export  the  specified  environment  variables  to the remote
nodes before executing the  program.


Is there an environment variable (OMPI_) with the same meaning?
The writing of environment variables on the command line is ugly and
tedious...
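For example, what we currently have to type looks something like this
(the variable FOO and the executable name are just placeholders):

   mpiexec -x LD_LIBRARY_PATH -x PATH -x FOO -np 4 ./a.out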


I've searched for this info on OpenMPI web pages for about an hour and
didn't find the answer :-/



Thanking you in anticipation,

Paul




--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915




Re: [OMPI users] an environment variable with same meaning than the -x option of mpiexec

2009-11-10 Thread Paul Kapinos

Hi Jeff,

FWIW, environment variables prefixed with "OMPI_" will automatically be 
distributed out to processes.  


Of course, but sadly the variable(s) we want to distribute aren't
"OMPI_" variables.






Depending on your environment and launcher, your entire environment may 
be copied out to all the processes, anyway (rsh does not, but 
environments like SLURM do), making the OMPI_* and -x mechanisms 
somewhat redundant.


Does this help?


For now I have set the $MPIEXEC variable to "mpiexec -x BLABLABLA" and
advise the users to use it. This is a bit ugly, but a working
workaround. What I wanted to achieve with my mail was a less ugly
solution :o)
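Roughly like this, in case it helps anyone (just a sketch; the variable
list and the place where we set it are examples from our site, not an
Open MPI feature):

   # set once, e.g. in a module file or login script:
   export MPIEXEC="mpiexec -x LD_LIBRARY_PATH -x PATH"
   # users then start their jobs with:
   $MPIEXEC -np 16 ./a.out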


Thanks for your help,

Paul Kapinos









Not at the moment - though I imagine we could create one. It is a tad
tricky in that we allow multiple -x options on the cmd line, but we
obviously can't do that with an envar.

The most likely solution would be to specify multiple "-x" equivalents
by separating them with a comma in the envar. It would take some
parsing to make it all work, but not impossible.

I can add it to the "to-do" list for a rainy day :-)


On Nov 6, 2009, at 7:59 AM, Paul Kapinos wrote:

> Dear OpenMPI developer,
>
> with the -x option of mpiexec there is a way to distribute
> environment variables:
>
> -x   Export  the  specified  environment  variables  to the
> remote
> nodes before executing the  program.
>
>
> Is there an environment variable (OMPI_) with the same meaning?
> The writing of environment variables on the command line is ugly and
> tedious...
>
> I've searched for this info on OpenMPI web pages for about an hour
> and didn't find the answer :-/
>
>
> Thanking you in anticipation,
>
> Paul
>
>
>
>
> --
> Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
> RWTH Aachen University, Center for Computing and Communication
> Seffenter Weg 23,  D 52074  Aachen (Germany)
> Tel: +49 241/80-24915
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users







--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915




Re: [OMPI users] an environment variable with same meaning than the -x option of mpiexec

2009-11-10 Thread Ralph Castain


On Nov 10, 2009, at 2:48 AM, Paul Kapinos wrote:


Hi Ralph,



Not at the moment - though I imagine we could create one. It is a  
tad tricky in that we allow multiple -x options on the cmd line,  
but we obviously can't do that with an envar.


why not?

export OMPI_Magic_Variable="-x LD_LIBRARY_PATH -x PATH"
could be possible, or not?


That is basically what I had in mind, but it now requires that we  
parse it. My point was that you can't do


export OMPI_dash_x="foo"
export OMPI_dash_x="bar"

like you would do on the cmd line itself, so now there has to be a  
special parser for handling the envar separate from the cmd line entry.


Not a big deal - just takes some code...which is why it isn't an  
immediate response.
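In other words (OMPI_dash_x is only the illustrative name from above,
not an implemented feature): on the command line the two -x options
accumulate,

   mpiexec -x LD_LIBRARY_PATH -x PATH -np 4 ./a.out

whereas with an environment variable the second export simply
overwrites the first, so a packed form would have to be split by Open
MPI itself, e.g. on commas (hypothetical syntax):

   export OMPI_dash_x="LD_LIBRARY_PATH,PATH"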







I can add it to the "to-do" list for a rainy day :-)

That would be great :-)

Thanks for your help!

Paul Kapinos




with the -x option of mpiexec there is a way to distribute
environment variables:


-x   Export  the  specified  environment  variables  to the  
remote

   nodes before executing the  program.


Is there an environment variable (OMPI_) with the same
meaning? The writing of environment variables on the command line
is ugly and tedious...


I've searched for this info on OpenMPI web pages for about an hour
and didn't find the answer :-/



Thanking you in anticipation,

Paul




--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Openmpi on Heterogeneous environment

2009-11-10 Thread Yogesh Aher
Thanks for the reply, Pallab. A firewall is not an issue, as I can
SSH password-less to/from both machines.
My problem is dealing with 32-bit & 64-bit architectures simultaneously (and
not with different operating systems). Is this possible with
Open MPI?

Look forward to the solution!

Thanks,
Yogesh


*From:* Pallab Datta (*datta_at_[hidden]*)

I have had issues running across platforms, i.e. Mac OS X and Linux
(Ubuntu), and haven't got them resolved. Check whether a firewall is
blocking any communication.

On Thu, Nov 5, 2009 at 7:47 PM, Yogesh Aher  wrote:

> Dear Open-mpi users,
>
> I have installed openmpi on 2 different machines with different
> architectures (INTEL and x86_64) separately (command: ./configure
> --enable-heterogeneous). Compiled executables of the same code for these 2
> arch. Kept these executables on individual machines. Prepared a hostfile
> containing the names of those 2 machines.
> Now, when I want to execute the code (giving command - ./mpirun -hostfile
> machines executable), it doesn't work, giving error message:
>
> MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
> with errorcode 1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --
> --
> mpirun has exited due to process rank 2 with PID 1712 on
> node studpc1.xxx..xx exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here)
>
> When I keep only one machine-name in the hostfile, then the execution works
> perfect.
>
> Will anybody please guide me to run the program on heterogeneous
> environment using mpirun!
>
> Thanking you,
>
> Sincerely,
> Yogesh
>


Re: [OMPI users] Openmpi on Heterogeneous environment

2009-11-10 Thread Jeff Squyres
Do you see any output from your executables?  I.e., are you sure that  
it's running the "correct" executables?  If so, do you know how far  
it's getting in its run before aborting?



On Nov 10, 2009, at 7:36 AM, Yogesh Aher wrote:

Thanks for the reply, Pallab. A firewall is not an issue, as I can
SSH password-less to/from both machines.
My problem is dealing with 32-bit & 64-bit architectures
simultaneously (and not with different operating systems). Is this
possible with Open MPI?


Look forward to the solution!

Thanks,
Yogesh


From: Pallab Datta (datta_at_[hidden])

I have had issues for running in cross platforms..ie. Mac OSX and  
Linux
(Ubuntu)..haven't got it resolved..check firewalls if thats blocking  
any

communication..

On Thu, Nov 5, 2009 at 7:47 PM, Yogesh Aher   
wrote:

Dear Open-mpi users,

I have installed openmpi on 2 different machines with different  
architectures (INTEL and x86_64) separately (command: ./configure -- 
enable-heterogeneous). Compiled executables of the same code for  
these 2 arch. Kept these executables on individual machines.  
Prepared a hostfile containing the names of those 2 machines.
Now, when I want to execute the code (giving command - ./mpirun - 
hostfile machines executable), it doesn't work, giving error message:


MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--
--
mpirun has exited due to process rank 2 with PID 1712 on
node studpc1.xxx..xx exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here)

When I keep only one machine-name in the hostfile, then the  
execution works perfect.


Will anybody please guide me to run the program on heterogeneous  
environment using mpirun!


Thanking you,

Sincerely,
Yogesh

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
jsquy...@cisco.com



[OMPI users] ipo: warning #11009: file format not recognized for /Libraries_intel/openmpi/lib/libmpi.so

2009-11-10 Thread vasilis gkanis
Dear all,

I am trying to compile openmpi-1.3.3 with the Intel Fortran compiler and gcc.

In order to compile openmpi I run configure with the following options:

./configure --prefix=/Libraries/openmpi FC=ifort --enable-mpi-f90

OpenMPI compiled just fine, but when I try to compile and link my program
against MPI, I get the following error:

ipo: warning #11009: file format not recognized for 
/Libraries_intel/openmpi/lib/libmpi.so
ld: skipping incompatible /Libraries_intel/openmpi/lib/libmpi.so when 
searching for -lmpi
ld: cannot find -lmpi

I have updated the LD_LIBRARY_PATH environment variable.
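(For reference, the kind of setting meant here, using the library path
from the error message above:

   export LD_LIBRARY_PATH=/Libraries_intel/openmpi/lib:$LD_LIBRARY_PATH
)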

Does anybody know what this error means?

Thank you,
Vasilis



[OMPI users] mpi_yield_when_idle effects

2009-11-10 Thread Simone Pellegrini

Hello,
I am getting some strange results when I enable the MCA parameter
mpi_yield_when_idle.
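(For reference, the parameter can be set on the mpirun command line or
via the environment; the executable name below is just an example:

   mpirun --mca mpi_yield_when_idle 1 -np 8 ./cg.B.8
)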


What happens is that for MPI programs which do lots of synchronization
(MPI_Barrier and MPI_Wait) I get a very good speedup (about 2x) by turning
on the parameter (e.g. the CG benchmark of the NAS parallel benchmarks suite).
I am not oversubscribing nodes: I am running 8 processes on an SMP system
with exactly 8 physical cores (the cache is shared between every 2 cores).


The only explanation I had for this result is temperature
issues that scale down the clock speed of the entire chip if all the
cores get too hot (because of the busy waiting). Anyway, I tried
to replicate the behavior with a trivial (non-MPI) code where one core
does some work while the others (belonging to the same chip) are
busy waiting, but I didn't get the same speedup when switching from the
busy-wait to the idle implementation.


Does anyone have an idea why this is happening?

regards, Simone


[OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry

Hi there,

I'm experiencing some issues using GE6.2U4 and OpenMPI-1.3.3 (with the
gridengine component).


During any job submission, SGE creates a session directory in $TMPDIR, 
named after the job id and the computing node name. This session 
directory is created using nobody/nogroup credentials.


When using OpenMPI with tight-integration, opal creates different 
subdirectories in this session directory. The issue I'm facing now is 
that OpenMPI fails to create these subdirectories:


[charlie:03882] opal_os_dirpath_create: Error: Unable to create the 
sub-directory (/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0) 
of (/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../openmpi-1.3.3/orte/util/session_dir.c at line 101
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../openmpi-1.3.3/orte/util/session_dir.c at line 425
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.3.3/orte/mca/ess/hnp/ess_hnp_module.c at line 273

--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

 orte_session_dir failed
 --> Returned value Error (-1) instead of ORTE_SUCCESS
--
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../openmpi-1.3.3/orte/runtime/orte_init.c at line 132

--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

 orte_ess_set_name failed
 --> Returned value Error (-1) instead of ORTE_SUCCESS
--
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../../../openmpi-1.3.3/orte/tools/orterun/orterun.c at line 473


This seems very likely related to the permissions set on $TMPDIR.

I'd like to know if someone might have experienced the same or a similar 
issue and if any solution was found.


Thanks for your help,
Eloi




--


Eloi Gaudry

Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM

Company Phone: +32 10 487 959
Company Fax:   +32 10 454 626



Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Ralph Castain
Creating a directory with such credentials sounds like a bug in SGE to  
me...perhaps an SGE config issue?


Only thing you could do is tell OMPI to use some other directory as  
the root for its session dir tree - check "mpirun -h", or ompi_info  
for the required option.
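If I remember right, the parameter in question is orte_tmpdir_base
(please verify with ompi_info), e.g. something like:

   ompi_info --param all all | grep tmpdir
   mpirun --mca orte_tmpdir_base /tmp -np 8 ./a.out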


But I would first check your SGE config as that just doesn't sound  
right.


On Nov 10, 2009, at 9:40 AM, Eloi Gaudry wrote:


Hi there,

I'm experiencing some issues using GE6.2U4 and OpenMPI-1.3.3 (with the
gridengine component).


During any job submission, SGE creates a session directory in  
$TMPDIR, named after the job id and the computing node name. This  
session directory is created using nobody/nogroup credentials.


When using OpenMPI with tight-integration, opal creates different  
subdirectories in this session directory. The issue I'm facing now  
is that OpenMPI fails to create these subdirectories:


[charlie:03882] opal_os_dirpath_create: Error: Unable to create the  
sub-directory (/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0) of (/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file ../../ 
openmpi-1.3.3/orte/util/session_dir.c at line 101
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file ../../ 
openmpi-1.3.3/orte/util/session_dir.c at line 425
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in  
file ../../../../../openmpi-1.3.3/orte/mca/ess/hnp/ess_hnp_module.c  
at line 273

--
It looks like orte_init failed for some reason; your parallel  
process is

likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

orte_session_dir failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
--
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file ../../ 
openmpi-1.3.3/orte/runtime/orte_init.c at line 132

--
It looks like orte_init failed for some reason; your parallel  
process is

likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

orte_ess_set_name failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
--
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in  
file ../../../../openmpi-1.3.3/orte/tools/orterun/orterun.c at line  
473


This seems very likely related to the permissions set on $TMPDIR.

I'd like to know if someone might have experienced the same or a  
similar issue and if any solution was found.


Thanks for your help,
Eloi




--


Eloi Gaudry

Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM

Company Phone: +32 10 487 959
Company Fax:   +32 10 454 626

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




[OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Chris Walker
Hello,

We've been having a lot of problems where openmpi jobs crash at
startup because the call to lsb_launch fails (we have a ticket open
with Platform about this).  Is there a way to disable the lsb_launch
startup mechanism at runtime and revert to ssh?  It's easy enough to
recompile without LSF support, but it'd be even easier to drop a
parameter in  openmpi-mca-params.conf.

Thanks!
Chris


Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry

Thanks for your help Ralph, I'll double check that.

As for the error message received, there might be some inconsistency: 
"/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0" is the parent 
directory and 
"/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0/53199/0/0" is 
the subdirectory... not the other way around.


Eloi




Ralph Castain wrote:
Creating a directory with such credentials sounds like a bug in SGE to 
me...perhaps an SGE config issue?


Only thing you could do is tell OMPI to use some other directory as 
the root for its session dir tree - check "mpirun -h", or ompi_info 
for the required option.


But I would first check your SGE config as that just doesn't sound right.

On Nov 10, 2009, at 9:40 AM, Eloi Gaudry wrote:


Hi there,

I'm experiencing some issues using GE6.2U4 and OpenMPI-1.3.3 (with 
gridengine compnent).


During any job submission, SGE creates a session directory in 
$TMPDIR, named after the job id and the computing node name. This 
session directory is created using nobody/nogroup credentials.


When using OpenMPI with tight-integration, opal creates different 
subdirectories in this session directory. The issue I'm facing now is 
that OpenMPI fails to create these subdirectories:


[charlie:03882] opal_os_dirpath_create: Error: Unable to create the 
sub-directory 
(/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0) of 
(/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../openmpi-1.3.3/orte/util/session_dir.c at line 101
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../openmpi-1.3.3/orte/util/session_dir.c at line 425
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.3.3/orte/mca/ess/hnp/ess_hnp_module.c at 
line 273
-- 


It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

orte_session_dir failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
-- 

[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../openmpi-1.3.3/orte/runtime/orte_init.c at line 132
-- 


It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

orte_ess_set_name failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
-- 

[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../../../openmpi-1.3.3/orte/tools/orterun/orterun.c at line 473


This seems very likely related to the permissions set on $TMPDIR.

I'd like to know if someone might have experienced the same or a 
similar issue and if any solution was found.


Thanks for your help,
Eloi




--


Eloi Gaudry

Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM

Company Phone: +32 10 487 959
Company Fax:   +32 10 454 626

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--


Eloi Gaudry

Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM

Company Phone: +32 10 487 959
Company Fax:   +32 10 454 626



Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Ralph Castain

What version of OMPI?

On Nov 10, 2009, at 9:49 AM, Chris Walker wrote:


Hello,

We've been having a lot of problems where openmpi jobs crash at
startup because the call to lsb_launch fails (we have a ticket open
with Platform about this).  Is there a way to disable the lsb_launch
startup mechanism at runtime and revert to ssh?  It's easy enough to
recompile without LSF support, but it'd be even easier to drop a
parameter in  openmpi-mca-params.conf.

Thanks!
Chris
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti

Hi,

Am 10.11.2009 um 17:55 schrieb Eloi Gaudry:


Thanks for your help Ralph, I'll double check that.

As for the error message received, there might be some  
inconsistency: "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0" is the


often /opt/sge is shared across the nodes, while the /tmp (sometimes  
implemented as /scratch in a partition on its own) should be local on  
each node.


What is the setting of "tmpdir" in your queue definition?

If you want to share /opt/sge/tmp, everyone must be able to write
into this location. As for me it's working fine (with the local /tmp);
I assume the nobody/nogroup comes from a squash setting in
the /etc/exports of your master node.
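For comparison, an export line without root squashing would look
roughly like this (adjust the network and path to your site):

   /opt/sge  192.168.0.0/255.255.255.0(rw,sync,no_subtree_check,no_root_squash)

With the default root_squash, files created by root on a client end up
owned by nobody:nogroup on the server, which would match what you are
seeing.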


-- Reuti


parent directory and "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0/53199/0/0" is the subdirectory... not the other way  
around.


Eloi



Ralph Castain wrote:
Creating a directory with such credentials sounds like a bug in  
SGE to me...perhaps an SGE config issue?


Only thing you could do is tell OMPI to use some other directory  
as the root for its session dir tree - check "mpirun -h", or  
ompi_info for the required option.


But I would first check your SGE config as that just doesn't sound  
right.


On Nov 10, 2009, at 9:40 AM, Eloi Gaudry wrote:


Hi there,

I'm experiencing some issues using GE6.2U4 and OpenMPI-1.3.3  
(with gridengine compnent).


During any job submission, SGE creates a session directory in  
$TMPDIR, named after the job id and the computing node name. This  
session directory is created using nobody/nogroup credentials.


When using OpenMPI with tight-integration, opal creates different  
subdirectories in this session directory. The issue I'm facing  
now is that OpenMPI fails to create these subdirectories:


[charlie:03882] opal_os_dirpath_create: Error: Unable to create  
the sub-directory (/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0) of (/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file ../../ 
openmpi-1.3.3/orte/util/session_dir.c at line 101
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file ../../ 
openmpi-1.3.3/orte/util/session_dir.c at line 425
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in  
file ../../../../../openmpi-1.3.3/orte/mca/ess/hnp/ 
ess_hnp_module.c at line 273
 
--
It looks like orte_init failed for some reason; your parallel  
process is

likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal  
failure;

here's some additional information (which may only be relevant to an
Open MPI developer):

orte_session_dir failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
 
--
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file ../../ 
openmpi-1.3.3/orte/runtime/orte_init.c at line 132
 
--
It looks like orte_init failed for some reason; your parallel  
process is

likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal  
failure;

here's some additional information (which may only be relevant to an
Open MPI developer):

orte_ess_set_name failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
 
--
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in  
file ../../../../openmpi-1.3.3/orte/tools/orterun/orterun.c at  
line 473


This seems very likely related to the permissions set on $TMPDIR.

I'd like to know if someone might have experienced the same or a  
similar issue and if any solution was found.


Thanks for your help,
Eloi




--


Eloi Gaudry

Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM

Company Phone: +32 10 487 959
Company Fax:   +32 10 454 626

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--


Eloi Gaudry

Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM

Company Phone: +32 10 487 959
Company Fax:   +32 10 454 626

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Chris Walker
We have modules for both 1.3.2 and 1.3.3 (intel compilers)

Chris

On Tue, Nov 10, 2009 at 11:58 AM, Ralph Castain  wrote:
> What version of OMPI?
>
> On Nov 10, 2009, at 9:49 AM, Chris Walker wrote:
>
>> Hello,
>>
>> We've been having a lot of problems where openmpi jobs crash at
>> startup because the call to lsb_launch fails (we have a ticket open
>> with Platform about this).  Is there a way to disable the lsb_launch
>> startup mechanism at runtime and revert to ssh?  It's easy enough to
>> recompile without LSF support, but it'd be even easier to drop a
>> parameter in  openmpi-mca-params.conf.
>>
>> Thanks!
>> Chris
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry

Thanks for your help Reuti,

I'm using an NFS-shared directory (/opt/sge/tmp), exported from the
master node to all other computing nodes:
  /etc/exports on the server (named moe.fft):
    /opt/sge  192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
  /etc/fstab on the clients:
    moe.fft:/opt/sge  /opt/sge  nfs  rw,bg,soft,timeo=14, 0 0
Actually, the /opt/sge/tmp directory is 777 across all machines, so
all users should be able to create a directory inside.


The issue seems somehow related to the session directory created inside
/opt/sge/tmp, let's say /opt/sge/tmp/29.1.smp8.q for example for
job 29 on queue smp8.q. This subdirectory of /opt/sge/tmp is created
with nobody:nogroup drwxr-xr-x permissions... which in turn forbids
OpenMPI from creating its subtree inside (as OpenMPI won't use
nobody:nogroup credentials).


As Ralph suggested, I checked the SGE configuration, but I haven't found
anything related to a nobody:nogroup configuration so far.


Eloi


Reuti wrote:

Hi,

Am 10.11.2009 um 17:55 schrieb Eloi Gaudry:


Thanks for your help Ralph, I'll double check that.

As for the error message received, there might be some inconsistency: 
"/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0" is the


often /opt/sge is shared across the nodes, while the /tmp (sometimes 
implemented as /scratch in a partition on its own) should be local on 
each node.


What is the setting of "tmpdir" in your queue definition?

If you want to share /opt/sge/tmp, everyone must be able to write into 
this location. As for me it's working fine (with the local /tmp), I 
assume the nobody/nogroup comes from any squash-setting in the 
/etc/export of you master node.


-- Reuti


parent directory and 
"/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0/53199/0/0" is 
the subdirectory... not the other way around.


Eloi



Ralph Castain wrote:
Creating a directory with such credentials sounds like a bug in SGE 
to me...perhaps an SGE config issue?


Only thing you could do is tell OMPI to use some other directory as 
the root for its session dir tree - check "mpirun -h", or ompi_info 
for the required option.


But I would first check your SGE config as that just doesn't sound 
right.


On Nov 10, 2009, at 9:40 AM, Eloi Gaudry wrote:


Hi there,

I'm experiencing some issues using GE6.2U4 and OpenMPI-1.3.3 (with 
gridengine compnent).


During any job submission, SGE creates a session directory in 
$TMPDIR, named after the job id and the computing node name. This 
session directory is created using nobody/nogroup credentials.


When using OpenMPI with tight-integration, opal creates different 
subdirectories in this session directory. The issue I'm facing now 
is that OpenMPI fails to create these subdirectories:


[charlie:03882] opal_os_dirpath_create: Error: Unable to create the 
sub-directory 
(/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0) of 
(/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../openmpi-1.3.3/orte/util/session_dir.c at line 101
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../openmpi-1.3.3/orte/util/session_dir.c at line 425
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.3.3/orte/mca/ess/hnp/ess_hnp_module.c at 
line 273
-- 

It looks like orte_init failed for some reason; your parallel 
process is

likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

orte_session_dir failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
-- 

[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../openmpi-1.3.3/orte/runtime/orte_init.c at line 132
-- 

It looks like orte_init failed for some reason; your parallel 
process is

likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

orte_ess_set_name failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
-- 

[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file 
../../../../openmpi-1.3.3/orte/tools/orterun/orterun.c at line 473


This seems very likely related to the permissions set on $TMPDIR.

I'd like to know if someone might have expe

Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Ralph Castain

Just add

plm = rsh

to your default mca param file.

You don't need to reconfigure or rebuild OMPI
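For example (the exact file location depends on your installation
prefix):

   # in <prefix>/etc/openmpi-mca-params.conf or ~/.openmpi/mca-params.conf
   plm = rsh

or, just for a single run:

   mpirun --mca plm rsh -np 16 ./a.out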

On Nov 10, 2009, at 10:16 AM, Chris Walker wrote:


We have modules for both 1.3.2 and 1.3.3 (intel compilers)

Chris

On Tue, Nov 10, 2009 at 11:58 AM, Ralph Castain   
wrote:

What version of OMPI?

On Nov 10, 2009, at 9:49 AM, Chris Walker wrote:


Hello,

We've been having a lot of problems where openmpi jobs crash at
startup because the call to lsb_launch fails (we have a ticket open
with Platform about this).  Is there a way to disable the lsb_launch
startup mechanism at runtime and revert to ssh?  It's easy enough to
recompile without LSF support, but it'd be even easier to drop a
parameter in  openmpi-mca-params.conf.

Thanks!
Chris
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Chris Walker
Perfect!  Thanks very much,
Chris

On Tue, Nov 10, 2009 at 12:22 PM, Ralph Castain  wrote:
> Just add
>
> plm = rsh
>
> to your default mca param file.
>
> You don't need to reconfigure or rebuild OMPI
>
> On Nov 10, 2009, at 10:16 AM, Chris Walker wrote:
>
>> We have modules for both 1.3.2 and 1.3.3 (intel compilers)
>>
>> Chris
>>
>> On Tue, Nov 10, 2009 at 11:58 AM, Ralph Castain  wrote:
>>>
>>> What version of OMPI?
>>>
>>> On Nov 10, 2009, at 9:49 AM, Chris Walker wrote:
>>>
 Hello,

 We've been having a lot of problems where openmpi jobs crash at
 startup because the call to lsb_launch fails (we have a ticket open
 with Platform about this).  Is there a way to disable the lsb_launch
 startup mechanism at runtime and revert to ssh?  It's easy enough to
 recompile without LSF support, but it'd be even easier to drop a
 parameter in  openmpi-mca-params.conf.

 Thanks!
 Chris
 ___
 users mailing list
 us...@open-mpi.org
 http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] ipo: warning #11009: file format not recognized for /Libraries_intel/openmpi/lib/libmpi.so

2009-11-10 Thread Nifty Tom Mitchell
On Tue, Nov 10, 2009 at 03:44:59PM +0200, vasilis gkanis wrote:
> 
> I am trying to compile openmpi-1.3.3 with intel Fortran and gcc compiler.
> 
> In order to compile openmpi I run configure with the following options:
> 
> ./configure --prefix=/Libraries/openmpi FC=ifort --enable-mpi-f90
> 
> OpenMpi compiled just fine, but when I am trying to compile and link my 
> program 
> against mpi, I get the following error:
> 
> ipo: warning #11009: file format not recognized for 
> /Libraries_intel/openmpi/lib/libmpi.so
> ld: skipping incompatible /Libraries_intel/openmpi/lib/libmpi.so when 
> searching for -lmpi
> ld: cannot find -lmpi
> 
> I have updated the LD_LIBRARY_PATH file.
> 
> Does anybody know what this error mean?

What does:
 file /Libraries_intel/openmpi/lib/libmpi.so
tell you?

Perhaps this is a 32bit .vs. 64bit mismatch?
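A 64-bit build typically reports something like (exact wording varies
with your file(1) version):

   libmpi.so: ELF 64-bit LSB shared object, x86-64 ...

while a 32-bit build reports "ELF 32-bit LSB shared object, Intel
80386 ...". If that does not match what your Intel compiler targets,
ld will skip the library, which is consistent with the messages above.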







-- 
T o m  M i t c h e l l 
Found me a new hat, now what?



Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using a nfs-shared directory (/opt/sge/tmp), exported from the  
master node to all others computing nodes.


It's highly advisable to have "tmpdir" local on each node. When
you use "cd $TMPDIR" in your job script, everything is done locally on
the node (when your application just creates its scratch files in the
current working directory), which speeds up the computation and
decreases the network traffic. Computing in a shared /opt/sge/tmp is
like computing in each user's home directory.


To prevent any user from removing someone else's files, the "t" flag
is set, as for /tmp: drwxrwxrwt 14 root root 4096 2009-11-10 18:35 /tmp/


Nevertheless:

  /etc/exports on the server (named moe.fft):
    /opt/sge  192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
  /etc/fstab on the clients:
    moe.fft:/opt/sge  /opt/sge  nfs  rw,bg,soft,timeo=14, 0 0
Actually, the /opt/sge/tmp directory is 777 across all machines,
thus all users should be able to create a directory inside.


All access checks will be applied:

- on the server: what is "ls -d /opt/sge/tmp" showing?
- the one from the export (this seems to be fine)
- the one on the node (i.e., how it's mounted: cat /etc/fstab)

The issue seems somehow related to the session directory created
inside /opt/sge/tmp, let's say /opt/sge/tmp/29.1.smp8.q for
example for job 29 on queue smp8.q. This subdirectory of
/opt/sge/tmp is created with nobody:nogroup drwxr-xr-x permissions...
which in turn forbids


Did you try to run some simple jobs before the parallel ones - are
these working? Were the daemons (qmaster and execd) started as root?


The user is known on the file server, i.e. the machine hosting /opt/sge?

OpenMPI to create its subtree inside (as OpenMPI won't use  
nobody:nogroup credentials).


In SGE the master process (the one running the job script) will
create /opt/sge/tmp/29.1.smp8.q, and so will each qrsh started
inside SGE - all with the same name. What is the definition of the
PE in SGE which you use?


-- Reuti


Ad Ralph suggested, I checked the SGE configuration, but I haven't  
found anything related to nobody:nogroup configuration so far.


Eloi


Reuti wrote:

Hi,

Am 10.11.2009 um 17:55 schrieb Eloi Gaudry:


Thanks for your help Ralph, I'll double check that.

As for the error message received, there might be some  
inconsistency: "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0" is the


often /opt/sge is shared across the nodes, while the /tmp  
(sometimes implemented as /scratch in a partition on its own)  
should be local on each node.


What is the setting of "tmpdir" in your queue definition?

If you want to share /opt/sge/tmp, everyone must be able to write  
into this location. As for me it's working fine (with the local / 
tmp), I assume the nobody/nogroup comes from any squash-setting in  
the /etc/export of you master node.


-- Reuti


parent directory and "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0/53199/0/0" is the subdirectory... not the other way  
around.


Eloi



Ralph Castain wrote:
Creating a directory with such credentials sounds like a bug in  
SGE to me...perhaps an SGE config issue?


Only thing you could do is tell OMPI to use some other directory  
as the root for its session dir tree - check "mpirun -h", or  
ompi_info for the required option.


But I would first check your SGE config as that just doesn't  
sound right.


On Nov 10, 2009, at 9:40 AM, Eloi Gaudry wrote:


Hi there,

I'm experiencing some issues using GE6.2U4 and OpenMPI-1.3.3  
(with gridengine compnent).


During any job submission, SGE creates a session directory in  
$TMPDIR, named after the job id and the computing node name.  
This session directory is created using nobody/nogroup  
credentials.


When using OpenMPI with tight-integration, opal creates  
different subdirectories in this session directory. The issue  
I'm facing now is that OpenMPI fails to create these  
subdirectories:


[charlie:03882] opal_os_dirpath_create: Error: Unable to create  
the sub-directory (/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0) of (/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in  
file ../../openmpi-1.3.3/orte/util/session_dir.c at line 101
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in  
file ../../openmpi-1.3.3/orte/util/session_dir.c at line 425
[charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in  
file ../../../../../openmpi-1.3.3/orte/mca/ess/hnp/ 
ess_hnp_module.c at line 273
-- 

It looks like orte_init failed for some reason; your parallel  
process is
likely to abort.  There are many reasons that a parallel  
process can

fail during orte_init; some 

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry

Reuti,

I'm using "tmpdir" as a shared directory that contains the session 
directories created during job submission, not for computing or local 
storage. Doesn't the session directory (i.e. job_id.queue_name) need to 
be shared among all computing nodes (at least the ones that would be 
used with orted during the parallel computation) ?



All sequential jobs run fine, as no write operation is performed in
"tmpdir/session_directory".


All users are known on the computing nodes and the master node (we use
LDAP authentication on all nodes).


As for the access checkings:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp

And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_nameround_robin
slots  32
user_lists NONE
xuser_listsNONE
start_proc_args/bin/true
stop_proc_args /bin/true
allocation_rule$round_robin
control_slaves TRUE
job_is_first_task  FALSE
urgency_slots  min
accounting_summary FALSE

Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using a nfs-shared directory (/opt/sge/tmp), exported from the 
master node to all others computing nodes.


It's highly advisable to have "tmpdir" local on each node. When you
use "cd $TMPDIR" in your job script, everything is done locally on a node
(when your application just creates its scratch files in your current
working directory), which speeds up the computation and decreases
the network traffic. Computing in a shared /opt/sge/tmp is like
computing in each user's home directory.


To avoid that any user can remove someone else's files, the "t" flag 
is set like for /tmp: drwxrwxrwt 14 root root 4096 2009-11-10 18:35 /tmp/


Nevertheless:

 with for /etc/export on server (named moe.fft):   /opt/sge
192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
   /etc/fstab on 
client:
moe.fft:/opt/sge
/opt/sgenfs rw,bg,soft,timeo=14, 0 0
Actually, the /opt/sge/tmp directory is 777 across all machines, thus 
all user should be able to create a directory inside.


All access checkings will be applied:

- on the server: what is "ls -d /opt/sge/tmp" showing?
- the one from the export (this seems to be fine)
- the one on the node (i.e., how it's mounted: cat /etc/fstab)

The issue seems somehow related to the session directory created 
inside /opt/sge/tmp, let's stay /opt/sge/tmp/29.1.smp8.q for example 
for the job 29 on queue smp8.q. This subdirectory of /opt/sge/tmp is 
created with nobody:nogroup drwxr-xr-x permissions... which in turn 
forbids


Did you try to run some simple jobs before the parallel ones - are 
these working? The daemons (qmaster and execd) were started as root?


The user is known on the file server, i.e. the machine hosting /opt/sge?

OpenMPI to create its subtree inside (as OpenMPI won't use 
nobody:nogroup credentials).


In SGE the master process (the one running the job script) will create 
the /opt/sge/tmp/29.1.smp8.q  and also each started qrsh inside SGE - 
all with the same name. What is your definition of the PE in SGE which 
you use?


-- Reuti


Ad Ralph suggested, I checked the SGE configuration, but I haven't 
found anything related to nobody:nogroup configuration so far.


Eloi


Reuti wrote:

Hi,

Am 10.11.2009 um 17:55 schrieb Eloi Gaudry:


Thanks for your help Ralph, I'll double check that.

As for the error message received, there might be some 
inconsistency: 
"/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0" is the


often /opt/sge is shared across the nodes, while the /tmp (sometimes 
implemented as /scratch in a partition on its own) should be local 
on each node.


What is the setting of "tmpdir" in your queue definition?

If you want to share /opt/sge/tmp, everyone must be able to write 
into this location. As for me it's working fine (with the local 
/tmp), I assume the nobody/nogroup comes from any squash-setting in 
the /etc/export of you master node.


-- Reuti


parent directory and 
"/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0/53199/0/0" 
is the subdirectory... not the other way around.


Eloi



Ralph Castain wrote:
Creating a directory with such credentials sounds like a bug in 
SGE to me...perhaps an SGE config issue?


Only thing you could do is tell OMPI to use some other directory 
as the root for its session dir tree - check "mpirun -h", or 
ompi_info for the required option.


But I would first check your SGE config as that just doesn't sound 
right.


On Nov 10, 2009, at 9:40 AM, Eloi Gaudry wrote:


Hi there,

I'm experiencing some issues using GE6.2U4 and OpenMPI-1.3.3 
(with gridengine compnent).


During any job submission, SGE creates a session directory in 
$TMPDIR, named after the job id and the computing node name. This 
session directory is created using nobody/nogroup 

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti

Hi,

Am 10.11.2009 um 19:01 schrieb Eloi Gaudry:


Reuti,

I'm using "tmpdir" as a shared directory that contains the session  
directories created during job submission, not for computing or  
local storage. Doesn't the session directory (i.e.  
job_id.queue_name) need to be shared among all computing nodes (at  
least the ones that would be used with orted during the parallel  
computation) ?


no. orted runs happily with local $TMPDIR on each and every node. The  
$TMPDIRs are intended to be used by the user for any temporary data  
for his job, as they are created and removed by SGE automatically for  
every job for his convenience.



All sequential job run fine, as no write operation is performed in  
"tmpdir/session_directory".


All users are known on the computing nodes and the master node  
(with use ldap authentication on all nodes).


As for the access checkings:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp


Aha, the + tells that there are some ACLs set:

getfacl /opt/sge/tmp



And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_nameround_robin
slots  32
user_lists NONE
xuser_listsNONE
start_proc_args/bin/true
stop_proc_args /bin/true
allocation_rule$round_robin
control_slaves TRUE
job_is_first_task  FALSE
urgency_slots  min
accounting_summary FALSE


Okay, fine.

-- Reuti



Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using a nfs-shared directory (/opt/sge/tmp), exported from  
the master node to all others computing nodes.


It's higly advisable to have the "tmpdir" local on each node. When  
you use "cd $TMPDIR" in your jobscript, all is done local on a  
node (when your application will just create the scratch file in  
your current working directory) which will speed up the  
computation and decrease the network traffic. Computing in as  
shared /opt/sge/tmp is like computing in each user's home directory.


To avoid that any user can remove someone else's files, the "t"  
flag is set like for /tmp: drwxrwxrwt 14 root root 4096 2009-11-10  
18:35 /tmp/


Nevertheless:

 with for /etc/export on server (named moe.fft):   /opt/sge 
192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
   /etc/fstab on  
client:moe.fft:/opt/ 
sge/opt/ 
sgenfs rw,bg,soft,timeo=14, 0 0
Actually, the /opt/sge/tmp directory is 777 across all machines,  
thus all user should be able to create a directory inside.


All access checkings will be applied:

- on the server: what is "ls -d /opt/sge/tmp" showing?
- the one from the export (this seems to be fine)
- the one on the node (i.e., how it's mounted: cat /etc/fstab)

The issue seems somehow related to the session directory created  
inside /opt/sge/tmp, let's stay /opt/sge/tmp/29.1.smp8.q for  
example for the job 29 on queue smp8.q. This subdirectory of /opt/ 
sge/tmp is created with nobody:nogroup drwxr-xr-x permissions...  
which in turn forbids


Did you try to run some simple jobs before the parallel ones - are  
these working? The daemons (qmaster and execd) were started as root?


The user is known on the file server, i.e. the machine hosting / 
opt/sge?


OpenMPI to create its subtree inside (as OpenMPI won't use  
nobody:nogroup credentials).


In SGE the master process (the one running the job script) will  
create the /opt/sge/tmp/29.1.smp8.q  and also each started qrsh  
inside SGE - all with the same name. What is your definition of  
the PE in SGE which you use?


-- Reuti


Ad Ralph suggested, I checked the SGE configuration, but I  
haven't found anything related to nobody:nogroup configuration so  
far.


Eloi


Reuti wrote:

Hi,

Am 10.11.2009 um 17:55 schrieb Eloi Gaudry:


Thanks for your help Ralph, I'll double check that.

As for the error message received, there might be some  
inconsistency: "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0" is the


often /opt/sge is shared across the nodes, while the /tmp  
(sometimes implemented as /scratch in a partition on its own)  
should be local on each node.


What is the setting of "tmpdir" in your queue definition?

If you want to share /opt/sge/tmp, everyone must be able to  
write into this location. As for me it's working fine (with the  
local /tmp), I assume the nobody/nogroup comes from any squash- 
setting in the /etc/export of you master node.


-- Reuti


parent directory and "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions- 
eg@charlie_0/53199/0/0" is the subdirectory... not the other  
way around.


Eloi



Ralph Castain wrote:
Creating a directory with such credentials sounds like a bug  
in SGE to me...perhaps an SGE config issue?


Only thing you could do is tell OMPI to use some other  
directory as the root for its session dir tree - check "mpirun  
-h", or o

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry

Reuti,

The ACLs here were just added when I tried to force the /opt/sge/tmp
subdirectories to be 777 (which I did when I first encountered the
subdirectory creation error within OpenMPI). I don't think the info I'll
provide will be meaningful here:


moe:~# getfacl /opt/sge/tmp
getfacl: Removing leading '/' from absolute path names
# file: opt/sge/tmp
# owner: sgeadmin
# group: fft
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:group:fft:rwx
default:mask::rwx
default:other::rwx

I'll try to use a local directory instead of a shared one for "tmpdir".
But as this issue seems somehow related to permissions, I don't know if
this would eventually be the right solution.


Thanks for your help,
Eloi

Reuti wrote:

Hi,

Am 10.11.2009 um 19:01 schrieb Eloi Gaudry:


Reuti,

I'm using "tmpdir" as a shared directory that contains the session 
directories created during job submission, not for computing or local 
storage. Doesn't the session directory (i.e. job_id.queue_name) need 
to be shared among all computing nodes (at least the ones that would 
be used with orted during the parallel computation) ?


no. orted runs happily with local $TMPDIR on each and every node. The 
$TMPDIRs are intended to be used by the user for any temporary data 
for his job, as they are created and removed by SGE automatically for 
every job for his convenience.



All sequential job run fine, as no write operation is performed in 
"tmpdir/session_directory".


All users are known on the computing nodes and the master node (with 
use ldap authentication on all nodes).


As for the access checkings:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp


Aha, the + tells that there are some ACLs set:

getfacl /opt/sge/tmp



And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_nameround_robin
slots  32
user_lists NONE
xuser_listsNONE
start_proc_args/bin/true
stop_proc_args /bin/true
allocation_rule$round_robin
control_slaves TRUE
job_is_first_task  FALSE
urgency_slots  min
accounting_summary FALSE


Okay, fine.

-- Reuti



Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using a nfs-shared directory (/opt/sge/tmp), exported from the 
master node to all others computing nodes.


It's higly advisable to have the "tmpdir" local on each node. When 
you use "cd $TMPDIR" in your jobscript, all is done local on a node 
(when your application will just create the scratch file in your 
current working directory) which will speed up the computation and 
decrease the network traffic. Computing in as shared /opt/sge/tmp is 
like computing in each user's home directory.


To avoid that any user can remove someone else's files, the "t" flag 
is set like for /tmp: drwxrwxrwt 14 root root 4096 2009-11-10 18:35 
/tmp/


Nevertheless:

 with for /etc/export on server (named moe.fft):   /opt/sge
192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
   /etc/fstab on 
client:
moe.fft:/opt/sge
/opt/sgenfs 
rw,bg,soft,timeo=14, 0 0
Actually, the /opt/sge/tmp directory is 777 across all machines, 
thus all user should be able to create a directory inside.


All access checkings will be applied:

- on the server: what is "ls -d /opt/sge/tmp" showing?
- the one from the export (this seems to be fine)
- the one on the node (i.e., how it's mounted: cat /etc/fstab)

The issue seems somehow related to the session directory created 
inside /opt/sge/tmp, let's stay /opt/sge/tmp/29.1.smp8.q for 
example for the job 29 on queue smp8.q. This subdirectory of 
/opt/sge/tmp is created with nobody:nogroup drwxr-xr-x 
permissions... which in turn forbids


Did you try to run some simple jobs before the parallel ones - are 
these working? The daemons (qmaster and execd) were started as root?


The user is known on the file server, i.e. the machine hosting 
/opt/sge?


OpenMPI to create its subtree inside (as OpenMPI won't use 
nobody:nogroup credentials).


In SGE the master process (the one running the job script) will 
create the /opt/sge/tmp/29.1.smp8.q  and also each started qrsh 
inside SGE - all with the same name. What is your definition of the 
PE in SGE which you use?


-- Reuti


Ad Ralph suggested, I checked the SGE configuration, but I haven't 
found anything related to nobody:nogroup configuration so far.


Eloi


Reuti wrote:

Hi,

Am 10.11.2009 um 17:55 schrieb Eloi Gaudry:


Thanks for your help Ralph, I'll double check that.

As for the error message received, there might be some 
inconsistency: 
"/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0" is the


often /opt/sge is shared across the nodes, while the /tmp 
(sometimes implemented as /scratch in a partition on its own) 
should be local on each node

[OMPI users] Coding help requested

2009-11-10 Thread amjad ali
Hi all.
(sorry for duplication, if it is)

I have to parallelize a CFD code using domain/grid/mesh partitioning among
the processes. Before running, we do not know:
(i) How many processes we will use (np is unknown)
(ii) How many neighbouring processes a process will have (my_nbrs = ?)
(iii) How many entries a process needs to send to a particular neighbouring
process.
But when the code runs, I calculate all of this info easily.


The problem is to copy a number of entries into an array and then send that array
to a destination process. The same sender has to repeat this work to send
data to all of its neighbouring processes. Is the following code fine:

DO i = 1, my_nbrs
   DO j = 1, few_entries_for_this_neighbour
   send_array(j)   =my_array(jth_particular_entry)
   ENDDO
   CALL MPI_ISEND(send_array(1:j),j, MPI_REAL8, dest(i), tag,
MPI_COMM_WORLD, request1(i), ierr)
ENDDO

And the corresponding receives, at each process:

DO i = 1, my_nbrs
   k = few_entries_from_this_neighbour
   CALL MPI_IRECV(recv_array(1:k),k, MPI_REAL8, source(i), tag,
MPI_COMM_WORLD, request2(i), ierr)
   DO j = 1, few_from_source(i)
   received_data(j)   =recv_array(j)
   ENDDO
ENDDO

After the above MPI_WAITALL.


I think this code will not work, both for sending and receiving. For the
non-blocking sends we cannot reuse send_array to send data to other processes
like above (as we are not sure the application buffer is available
for reuse). Am I right?

A similar problem exists with the recv array; data from multiple processes cannot be
received in the same array like above. Am I right?


The target is to hide communication behind computation, so we need non-blocking
communication. As we do not know the value of np or the values of my_nbrs for each
process, we cannot decide how many arrays to create. Please suggest a solution.


===
A more subtle solution that I could imagine is the following:

cc = 0
DO i = 1, my_nbrs
   DO j = 1, few_entries_for_this_neighbour
   send_array(cc+j)   =my_array(jth_particular_entry)
   ENDDO
   CALL MPI_ISEND(send_array(cc:cc+j),j, MPI_REAL8, dest(i), tag,
MPI_COMM_WORLD, request1(i), ierr)
   cc = cc  + j
ENDDO

And the corresponding receives, at each process:

cc = 0
DO i = 1, my_nbrs
   k = few_entries_from_this_neighbour
   CALL MPI_IRECV(recv_array(cc+1:cc+k),k, MPI_REAL8, source(i), tag,
MPI_COMM_WORLD, request2(i), ierr)
   DO j = 1, k
   received_data(j)   =recv_array(cc+j)
   ENDDO
   cc = cc + k
ENDDO

After the above MPI_WAITALL.

This means that
send_array for all neighbours will have a collected shape:
send_array = [... entries for nbr 1 ..., ... entries for nbr 2 ..., ..., ...
entries for last nbr ...]
And the respective entries will be sent to the respective neighbours as above.


recv_array for all neighbours will have a collected shape:
recv_array = [... entries from nbr 1 ..., ... entries from nbr 2 ..., ...,
... entries from last nbr ...]
And the entries from the processes will be received at the respective
locations/portions in the recv_array.


Is this scheme fine and correct?

I am in search of an efficient one.

Request for help.


With best regards,
Amjad Ali.


[OMPI users] How do you get static linkage for Intel compiler libs for the orterun executable?

2009-11-10 Thread Blosch, Edwin L
I'm trying to build OpenMPI with Intel compilers, both static and dynamic libs, 
then move it to a system that does not have Intel compilers.  I don't care 
about system libraries or OpenMPI loadable modules being dynamic, that's all 
fine.  But I need the compiler libs to be statically linked into any executable.

I don't seem to be smart enough to figure out how to get the Intel libs 
statically linked into the "orterun" command.

Can someone help suggest the right way to achieve this?

Here's my configure command and the relevant output from the "make" inside 
tools/orterun.  Notice that I am passing -i-static in LDFLAGS, and it does 
indeed appear to have made it into the link line for orterun.  It just didn't 
have the desired effect.  A subsequent 'ldd' shows that there is still a 
dependency on the libimf.so.

Thanks

./configure
--prefix=/release/cfd/openmpi-intel
--enable-mpirun-prefix-by-default
--enable-contrib-no-build=vt
--disable-per-user-config-files
--enable-mca-no-build=maffinity
--enable-static
--without-openib
--without-tm
--with-mpi-f90-size=small
CXX=/appserv/intel/cce/10.1.021/bin/icpc
CC=/appserv/intel/cce/10.1.021/bin/icc
'CFLAGS=  -O2'
'CXXFLAGS=  -O2'
F77=/appserv/intel/fce/10.1.021/bin/ifort
'FFLAGS=-D_GNU_SOURCE -fpe0 -no-ftz -traceback  -O2'
FC=/appserv/intel/fce/10.1.021/bin/ifort
'FCFLAGS=-D_GNU_SOURCE -fpe0 -no-ftz -traceback  -O2'
'LDFLAGS= -i-static'



make[2]: Entering directory `/home/bloscel/builds/openmpi/orte/tools/orterun'
depbase=`echo main.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
/appserv/intel/cce/10.1.021/bin/icc -DHAVE_CONFIG_H -I. -I../../../opal/include 
-I../../../orte/include -I../../../ompi/include 
-I../../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../..-DNDEBUG 
-finline-functions -fno-strict-aliasing -restrict -pthread -fvisibility=hidden 
-g -MT main.o -MD -MP -MF $depbase.Tpo -c -o main.o main.c &&\
mv -f $depbase.Tpo $depbase.Po
depbase=`echo orterun.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
/appserv/intel/cce/10.1.021/bin/icc -DHAVE_CONFIG_H -I. -I../../../opal/include 
-I../../../orte/include -I../../../ompi/include 
-I../../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../..-DNDEBUG 
-finline-functions -fno-strict-aliasing -restrict -pthread -fvisibility=hidden 
-g -MT orterun.o -MD -MP -MF $depbase.Tpo -c -o orterun.o orterun.c &&\
mv -f $depbase.Tpo $depbase.Po
depbase=`echo debuggers.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
/appserv/intel/cce/10.1.021/bin/icc -DHAVE_CONFIG_H -I. -I../../../opal/include 
-I../../../orte/include -I../../../ompi/include 
-I../../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../..-DNDEBUG 
-finline-functions -fno-strict-aliasing -restrict -pthread -fvisibility=hidden 
-g -MT debuggers.o -MD -MP -MF $depbase.Tpo -c -o debuggers.o debuggers.c &&\
mv -f $depbase.Tpo $depbase.Po
/bin/sh ../../../libtool --tag=CC   --mode=link 
/appserv/intel/cce/10.1.021/bin/icc  -DNDEBUG -finline-functions 
-fno-strict-aliasing -restrict -pthread -fvisibility=hidden -g  -export-dynamic 
 -i-static  -o orterun main.o orterun.o debuggers.o 
../../../orte/libopen-rte.la -lnsl -lutil
libtool: link: /appserv/intel/cce/10.1.021/bin/icc -DNDEBUG -finline-functions 
-fno-strict-aliasing -restrict -pthread -fvisibility=hidden -g -i-static -o 
.libs/orterun main.o orterun.o debuggers.o -Wl,--export-dynamic  
../../../orte/.libs/libopen-rte.so 
/home/bloscel/builds/openmpi/opal/.libs/libopen-pal.so -ldl -lnsl -lutil 
-pthread -Wl,-rpath -Wl,/release/cfd/openmpi-intel/lib
/appserv/intel/cce/10.1.021/lib/libimf.so: warning: warning: feupdateenv is not 
implemented and will always fail





Re: [OMPI users] Coding help requested

2009-11-10 Thread Eugene Loh

amjad ali wrote:


Hi all.
(sorry for duplication, if it is)

I have to parallelize a CFD code using domain/grid/mesh partitioning 
among the processes. Before running, we do not know,

(i) How many processes we will use ( np  is unknown)
(ii) A process will have how many neighbouring processes (my_nbrs = ?)
(iii) How many entries a process need to send to a particular 
neighbouring process.

But when the code run, I calculate all of this info easily.


The problem is to copy a number of entries to an array then send that 
array to a destination process. The same sender has to repeat this 
work to send data to all of its neighbouring processes. Is this 
following code fine:


DO i = 1, my_nbrs
   DO j = 1, few_entries_for_this_neighbour
   send_array(j)   =my_array(jth_particular_entry)
   ENDDO
   CALL MPI_ISEND(send_array(1:j),j, MPI_REAL8, dest(i), tag, 
MPI_COMM_WORLD, request1(i), ierr)


instead of "j" I assume you intended something like 
"few_entries_for_this_neighbour"



ENDDO

And the corresponding receives, at each process:

DO i = 1, my_nbrs
   k = few_entries_from_this_neighbour
   CALL MPI_IRECV(recv_array(1:k),k, MPI_REAL8, source(i), tag, 
MPI_COMM_WORLD, request2(i), ierr)

   DO j = 1, few_from_source(i)
   received_data(j)   =recv_array(j)
   ENDDO
ENDDO

After the above MPI_WAITALL.


I think this code will not work. Both for sending and receiving. For 
the non-blocking sends we cannot use send_array to send data to other 
processes like above (as we are not sure for the availability of 
application buffer for reuse). Am I right?


Similar problem is with recv array; data from multiple processes 
cannot be received in the same array like above. Am I right?


Correct for both send and receive.  When you call MPI_Isend, the buffer 
cannot be written until the MPI_Waitall.  When you use MPI_Irecv, you 
cannot read the data until MPI_Waitall.  You're reusing both send and 
receive buffers too often and too soon.


The target is to hide communication behind computation, so we need 
non-blocking communication. As we do not know the value of np or the values 
of my_nbrs for each process, we cannot decide how many arrays to create. 
Please suggest a solution.


You can allocate memory dynamically, even in Fortran.
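
For example (just a sketch; the names and sizes are placeholders for
whatever you compute at run time after the partitioning):

   DOUBLE PRECISION, ALLOCATABLE :: send_array(:), recv_array(:)
   INTEGER, ALLOCATABLE :: request1(:), request2(:)

   ! total_send/total_recv = sum of the per-neighbour counts
   ALLOCATE(send_array(total_send), recv_array(total_recv))
   ALLOCATE(request1(my_nbrs), request2(my_nbrs))

so you do not need to know np or my_nbrs at compile time.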


A more subtle solution that I could assume is following:

cc = 0
DO i = 1, my_nbrs
   DO j = 1, few_entries_for_this_neighbour
   send_array(cc+j)   =my_array(jth_particular_entry)
   ENDDO
   CALL MPI_ISEND(send_array(cc:cc+j),j, MPI_REAL8, dest(i), tag, 
MPI_COMM_WORLD, request1(i), ierr)

   cc = cc  + j
ENDDO


Same issue with j as before, but yes concatenating the various send 
buffers in a one-dimensional fashion should work.



And the corresponding receives, at each process:

cc = 0
DO i = 1, my_nbrs
   k = few_entries_from_this_neighbour
   CALL MPI_IRECV(recv_array(cc+1:cc+k),k, MPI_REAL8, source(i), tag, 
MPI_COMM_WORLD, request2(i), ierr)

   DO j = 1, k
   received_data(j)   =recv_array(cc+j)
   ENDDO
   cc = cc + k
ENDDO


Okay, but you're still reading the data before the MPI_Waitall call.  If 
you call MPI_Irecv(buffer,...), you cannot read the buffer's contents 
until the corresponding MPI_Waitall (or variant).
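
To make that concrete, here is a rough sketch of the whole exchange with
the packing and unpacking moved to the safe side of the waits (untested,
declarations omitted; send_count, recv_count, send_index and the 2-D
received_data are placeholder names for whatever you compute yourself):

   ! post all sends and receives up front
   scc = 0
   rcc = 0
   DO i = 1, my_nbrs
      n = send_count(i)
      DO j = 1, n
         send_array(scc+j) = my_array(send_index(scc+j))
      ENDDO
      CALL MPI_ISEND(send_array(scc+1), n, MPI_REAL8, dest(i), tag, &
                     MPI_COMM_WORLD, request1(i), ierr)
      scc = scc + n

      m = recv_count(i)
      CALL MPI_IRECV(recv_array(rcc+1), m, MPI_REAL8, source(i), tag, &
                     MPI_COMM_WORLD, request2(i), ierr)
      rcc = rcc + m
   ENDDO

   ! ... do local computation here to overlap with the communication ...

   CALL MPI_WAITALL(my_nbrs, request1, MPI_STATUSES_IGNORE, ierr)
   CALL MPI_WAITALL(my_nbrs, request2, MPI_STATUSES_IGNORE, ierr)

   ! only now is it safe to reuse send_array and to read recv_array;
   ! unpack each neighbour's block (or just index recv_array directly)
   rcc = 0
   DO i = 1, my_nbrs
      DO j = 1, recv_count(i)
         received_data(i,j) = recv_array(rcc+j)
      ENDDO
      rcc = rcc + recv_count(i)
   ENDDO

The key point is simply that nothing touches send_array or recv_array
between the MPI_ISEND/MPI_IRECV calls and the corresponding MPI_WAITALL.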



After the above MPI_WAITALL.

Means that,
send_array for all neighbours will have a collected shape:
send_array = [... entries for nbr 1 ..., ... entries for nbr 1 ..., 
..., ... entries for last nbr ...]
And the respective entries will be send to respective neighbours as 
above.



recv_array for all neighbours will have a collected shape:
recv_array = [... entries from nbr 1 ..., ... entries from nbr 1 ..., 
..., ... entries from last nbr ...]
And the entries from the processes will be received at respective 
locations/portion in the recv_array.



Is this scheme is quite fine and correct.

I am in search of efficient one.




[OMPI users] Problem with mpirun -preload-binary option

2009-11-10 Thread Qing Pang

I'm having a problem getting the mpirun "preload-binary" option to work.

I'm using Ubuntu 8.10 with Open MPI 1.3.3, nodes connected with Ethernet cable.
If I copy the executable to client nodes using scp, then do mpirun, 
everything works.


But I really want to avoid the copying, so I tried the -preload-binary 
option.


When I typed the command on my master node as below (gordon-desktop is 
my master node, and gordon-laptop is the client node):


--
gordon_at_gordon-desktop:~/Desktop/openmpi-1.3.3/examples$  mpirun
-machinefile machine.linux -np 2 --preload-binary $(pwd)/hello_c.out
--

I got the following:

gordon_at_gordon-desktop's password:  (I entered my password here, why 
am I asked for the password? I am working under this account anyway)



WARNING: Remote peer ([[18118,0],1]) failed to preload a file.

Exit Status: 256
Local  File: 
/tmp/openmpi-sessions-gordon_at_gordon-laptop_0/18118/0/hello_c.out

Remote File: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
Command:
 scp  
gordon-desktop:/home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out

/tmp/openmpi-sessions-gordon_at_gordon-laptop_0/18118/0/hello_c.out

Will continue attempting to launch the process(es).
--
--
mpirun was unable to launch the specified application as it could not 
access

or execute an executable:

Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
Node: node1

while attempting to start process rank 1.
--

Has anyone succeeded with the 'preload-binary' option with similar 
settings? I assume this mpirun option should work when compiling Open MPI 
with default options? Anything I need to set?


--qing



Re: [OMPI users] Problem with mpirun -preload-binary option

2009-11-10 Thread Ralph Castain
It -should- work, but you need password-less ssh setup. See our FAQ  
for how to do that, if you are unfamiliar with it.
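
For example, the usual OpenSSH recipe, run on the machine that initiates
the connection (adjust user and host names; this is only a sketch of what
the FAQ describes):

  ssh-keygen -t rsa                  # accept the defaults, empty passphrase
  ssh-copy-id gordon@gordon-laptop   # or append ~/.ssh/id_rsa.pub to the remote authorized_keys
  ssh gordon-laptop hostname         # should now work without a password prompt

Note that the password prompt in your output is for gordon-desktop itself:
the scp shown in the warning runs on the remote node and pulls the binary
back from the head node, so the compute nodes also need password-less
access back to gordon-desktop.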



On Nov 10, 2009, at 2:02 PM, Qing Pang wrote:


I'm having problem getting the mpirun "preload-binary" option to work.

I'm using ubutu8.10 with openmpi 1.3.3, nodes connected with  
Ethernet cable.
If I copy the executable to client nodes using scp, then do mpirun,  
everything works.


But I really want to avoid the copying, so I tried the -preload- 
binary option.


When I typed the command on my master node as below (gordon-desktop  
is my master node, and gordon-laptop is the client node):


--
gordon_at_gordon-desktop:~/Desktop/openmpi-1.3.3/examples$  mpirun
-machinefile machine.linux -np 2 --preload-binary $(pwd)/hello_c.out
--

I got the following:

gordon_at_gordon-desktop's password:  (I entered my password here,  
why am I asked for the password? I am working under this account  
anyway)



WARNING: Remote peer ([[18118,0],1]) failed to preload a file.

Exit Status: 256
Local  File: /tmp/openmpi-sessions-gordon_at_gordon-laptop_0/18118/0/ 
hello_c.out

Remote File: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
Command:
scp  gordon-desktop:/home/gordon/Desktop/openmpi-1.3.3/examples/ 
hello_c.out

/tmp/openmpi-sessions-gordon_at_gordon-laptop_0/18118/0/hello_c.out

Will continue attempting to launch the process(es).
--
--
mpirun was unable to launch the specified application as it could  
not access

or execute an executable:

Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
Node: node1

while attempting to start process rank 1.
--

Had anyone succeeded with the 'preload-binary' option with the  
similar settings? I assume this mpirun option should work when  
compiling openmpi with default  options? Anything I need to set?


--qing

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




[OMPI users] System hang-up on MPI_Reduce

2009-11-10 Thread Glembek Ondřej

Hi,

I am using the MPI_Reduce operation on a 122880x400 matrix of doubles. The 
parallel job runs on 32 machines, each having a different processor in 
terms of speed, but the architecture and OS are the same on all 
machines (x86_64). The task is a typical map-and-reduce, i.e. each of 
the processes collects some data, which is then summed (MPI_Reduce w. 
MPI_SUM).


Having different processors, each of the jobs arrives at the MPI_Reduce 
at a different time.


The *first problem* came when I called MPI_Reduce on the whole matrix.  
The system ended up with *MPI_ERR_OTHER error*, each time on different  
rank. I fixed this problem by chunking up the matrix into 2048  
submatrices, calling MPI_Reduce in cycle.


However, a *second problem* arose --- MPI_Reduce hangs up... It 
apparently gets stuck in some kind of dead-lock or something like 
that. It seems that if the processors are of similar speed, the 
problem disappears, however I cannot guarantee this condition all the 
time.


I managed to get rid of the problem (at least after few  
non-problematic iterations) by sticking MPI_Barrier before the  
MPI_Reduce line.


The questions are:

1) is this a usual behavior???
2) is there some kind of timeout for MPI_Reduce???
3) why does MPI_Reduce die on large amount of data if the system has  
enough address space (64 bit compilation)


Thanx
Ondrej Glembek


--
  Ondrej Glembek, PhD student  E-mail: glem...@fit.vutbr.cz
  UPGM FIT VUT Brno, L226  Web:http://www.fit.vutbr.cz/~glembek
  Bozetechova 2, 612 66Phone:  +420 54114-1292
  Brno, Czech Republic Fax:+420 54114-1290

  ICQ: 93233896
  GPG: C050 A6DC 7291 6776 9B69 BB11 C033 D756 6F33 DE3C




[OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Tom Rosmond
I want to run a number of MPI executables simultaneously in a PBS job.
For example on my system I do 'cat $PBS_NODEFILE' and get a list like
this:

n04
n04
n04
n04
n06
n06
n06
n06
n07
n07
n07
n07
n09
n09
n09
n09

i.e., 16 processors on 4 nodes, which I can parse into file(s) as
desired.  If I want to run prog1 on 1 node (4 processors), prog2 on 1
node (4 processors), and prog3 on 2 nodes (8 processors), I think the
syntax will be something like:

  mpirun -np 4 --hostfile nodefile1 prog1: \
 -np 4 --hostfile nodefile2 prog2: \
 -np 8 --hostfile nodefile3 prog3

Where nodefile1, nodefile2, and nodefile3 are the lists extracted from
PBS_NODEFILE.  Is this correct?  Any suggestion/advice, (e.g. syntax of
the 'nodefiles'), is appreciated.

T. Rosmond





Re: [OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Ralph Castain

What version are you trying to do this with?

Reason I ask: in 1.3.x, we introduced relative node syntax for  
specifying hosts to use. This would eliminate the need to create the  
hostfiles.


You might do a "man orte_hosts" (assuming you installed the man pages)  
and see what it says.
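
For example, with the 16-slot allocation shown below, something along
these lines should do it with the relative syntax (untested here, so
please check the man page for the exact form):

  mpirun -np 4 -host +n0 prog1 : \
         -np 4 -host +n1 prog2 : \
         -np 8 -host +n2,+n3 prog3

where +n0 ... +n3 refer to the first through fourth node of the
allocation PBS gave you, so no parsing of $PBS_NODEFILE is needed.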


Ralph

On Nov 10, 2009, at 2:46 PM, Tom Rosmond wrote:


I want to run a number of MPI executables simultaneously in a PBS job.
For example on my system I do 'cat $PBS_NODEFILE' and get a list like
this:

n04
n04
n04
n04
n06
n06
n06
n06
n07
n07
n07
n07
n09
n09
n09
n09

i.e, 16 processors on 4 nodes. from which I can parse into file(s) as
desired.  If I want to run prog1 on 1 node (4 processors), prog2 on 1
node (4 processors), and prog3 on 2 nodes (8 processors), I think the
syntax will be something like:

 mpirun -np 4 --hostfile nodefile1 prog1: \
-np 4 --hostfile nodefile2 prog2: \
-np 8 --hostfile nodefile3 prog3

Where nodefile1, nodefile2, and nodefile3 are the lists extracted from
PBS_NODEFILE.  Is this correct?  Any suggestion/advice, (e.g. syntax  
of

the 'nodefiles'), is appreciated.

T. Rosmond



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] System hang-up on MPI_Reduce

2009-11-10 Thread Ralph Castain

Yeah, that is "normal". It has to do with unexpected messages.

When you have procs running at significantly different speeds, the  
various operations get far enough out of sync that the memory consumed  
by recvd messages not yet processed grows too large.


Instead of sticking barriers into your code, you can have OMPI do an  
internal sync after every so many operations to avoid the problem.  
This is done by enabling the "sync" collective component, and then  
adjusting the number of operations between forced syncs.


Do an "ompi_info --params coll sync" to see the options. Then set the  
coll_sync_priority to something like 100 and it should work for you.
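
For example (the parameter names below should match what ompi_info
reports on 1.3.x; trust your own ompi_info output if it differs):

  ompi_info --param coll sync
  mpirun --mca coll_sync_priority 100 --mca coll_sync_barrier_after 1000 \
         -np 32 ./your_app

coll_sync_barrier_after (and _before) set how many collective operations
are allowed between the forced internal barriers.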


Ralph

On Nov 10, 2009, at 2:45 PM, Glembek Ondřej wrote:


Hi,

I am using MPI_Reduce operation on 122880x400 matrix of doubles. The  
parallel job runs on 32 machines, each having different processor in  
terms of speed, but the architecture and OS is the same on all  
machines (x86_64). The task is a typical map-and-reduce, i.e. each  
of the processes collects some data, which is then summed  
(MPI_Reduce w. MPI_SUM).


Having different processors, each of the jobs comes to the  
MPI_Reduce in different time.


The *first problem* came when I called MPI_Reduce on the whole  
matrix. The system ended up with *MPI_ERR_OTHER error*, each time on  
different rank. I fixed this problem by chunking up the matrix into  
2048 submatrices, calling MPI_Reduce in cycle.


However *second problem* arose --- MPI_Reduce hangs up... It  
apparently gets stuck in some kind of dead-lock or something like  
that. It seems that if the processors are of similar speed, the  
problem disappears, however I cannot provide this condition all the  
time.


I managed to get rid of the problem (at least after few non- 
problematic iterations) by sticking MPI_Barrier before the  
MPI_Reduce line.


The questions are:

1) is this a usual behavior???
2) is there some kind of timeout for MPI_Reduce???
3) why does MPI_Reduce die on large amount of data if the system has  
enough address space (64 bit compilation)


Thanx
Ondrej Glembek


--
 Ondrej Glembek, PhD student  E-mail: glem...@fit.vutbr.cz
 UPGM FIT VUT Brno, L226  Web:http://www.fit.vutbr.cz/~glembek
 Bozetechova 2, 612 66Phone:  +420 54114-1292
 Brno, Czech Republic Fax:+420 54114-1290

 ICQ: 93233896
 GPG: C050 A6DC 7291 6776 9B69 BB11 C033 D756 6F33 DE3C


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] How do you get static linkage for Intel compiler libs for the orterun executable?

2009-11-10 Thread Jeff Squyres
I'm away from icc help resources, but try the -static-intel compiler  
flag.



On Nov 10, 2009, at 2:51 PM, Blosch, Edwin L wrote:

I’m trying to build OpenMPI with Intel compilers, both static and  
dynamic libs, then move it to a system that does not have Intel  
compilers.  I don’t care about system libraries or OpenMPI loadable  
modules being dynamic, that’s all fine.  But I need the compiler  
libs to be statically linked into any executable.


I don’t seem to be smart enough to figure out how to get the Intel  
libs statically linked into the “orterun” command.


Can someone help suggest the right way to achieve this?

Here’s my configure command and the relevant output from the “make”  
inside tools/orterun.  Notice that I am passing –i-static in  
LDFLAGS, and it does indeed appear to have made it into the link  
line for orterun.  It just didn’t have the desired effect.  A  
subsequent ‘ldd’ shows that there is still a dependency on the  
libimf.so.


Thanks

./configure
--prefix=/release/cfd/openmpi-intel
--enable-mpirun-prefix-by-default
--enable-contrib-no-build=vt
--disable-per-user-config-files
--enable-mca-no-build=maffinity
--enable-static
--without-openib
--without-tm
--with-mpi-f90-size=small
CXX=/appserv/intel/cce/10.1.021/bin/icpc
CC=/appserv/intel/cce/10.1.021/bin/icc
'CFLAGS=  -O2'
'CXXFLAGS=  -O2'
F77=/appserv/intel/fce/10.1.021/bin/ifort
'FFLAGS=-D_GNU_SOURCE -fpe0 -no-ftz -traceback  -O2'
FC=/appserv/intel/fce/10.1.021/bin/ifort
'FCFLAGS=-D_GNU_SOURCE -fpe0 -no-ftz -traceback  -O2'
'LDFLAGS= -i-static'



make[2]: Entering directory `/home/bloscel/builds/openmpi/orte/tools/ 
orterun'

depbase=`echo main.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
/appserv/intel/cce/10.1.021/bin/icc -DHAVE_CONFIG_H -I. -I../../../ 
opal/include -I../../../orte/include -I../../../ompi/include - 
I../../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../..- 
DNDEBUG -finline-functions -fno-strict-aliasing -restrict -pthread - 
fvisibility=hidden -g -MT main.o -MD -MP -MF $depbase.Tpo -c -o  
main.o main.c &&\

mv -f $depbase.Tpo $depbase.Po
depbase=`echo orterun.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
/appserv/intel/cce/10.1.021/bin/icc -DHAVE_CONFIG_H -I. -I../../../ 
opal/include -I../../../orte/include -I../../../ompi/include - 
I../../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../..- 
DNDEBUG -finline-functions -fno-strict-aliasing -restrict -pthread - 
fvisibility=hidden -g -MT orterun.o -MD -MP -MF $depbase.Tpo -c -o  
orterun.o orterun.c &&\

mv -f $depbase.Tpo $depbase.Po
depbase=`echo debuggers.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
/appserv/intel/cce/10.1.021/bin/icc -DHAVE_CONFIG_H -I. -I../../../ 
opal/include -I../../../orte/include -I../../../ompi/include - 
I../../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../..- 
DNDEBUG -finline-functions -fno-strict-aliasing -restrict -pthread - 
fvisibility=hidden -g -MT debuggers.o -MD -MP -MF $depbase.Tpo -c -o  
debuggers.o debuggers.c &&\

mv -f $depbase.Tpo $depbase.Po
/bin/sh ../../../libtool --tag=CC   --mode=link /appserv/intel/cce/ 
10.1.021/bin/icc  -DNDEBUG -finline-functions -fno-strict-aliasing - 
restrict -pthread -fvisibility=hidden -g  -export-dynamic  -i- 
static  -o orterun main.o orterun.o debuggers.o ../../../orte/ 
libopen-rte.la -lnsl -lutil
libtool: link: /appserv/intel/cce/10.1.021/bin/icc -DNDEBUG -finline- 
functions -fno-strict-aliasing -restrict -pthread - 
fvisibility=hidden -g -i-static -o .libs/orterun main.o orterun.o  
debuggers.o -Wl,--export-dynamic  ../../../orte/.libs/libopen- 
rte.so /home/bloscel/builds/openmpi/opal/.libs/libopen-pal.so -ldl - 
lnsl -lutil -pthread -Wl,-rpath -Wl,/release/cfd/openmpi-intel/lib
/appserv/intel/cce/10.1.021/lib/libimf.so: warning: warning:  
feupdateenv is not implemented and will always fail




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI users] How do you get static linkage for Intel compiler libs for the orterun executable?

2009-11-10 Thread Reuti

Am 10.11.2009 um 23:26 schrieb Jeff Squyres:

I'm away from icc help resources, but try the -static-intel  
compiler flag.


I also like the compiler specific libs to be linked in statically - I  
just rename the *.so to *.so.disabled. So the linker is forced to use  
the .a files of the Intel libs.
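
For example (paths taken from the earlier post; if the compiler install
is shared with other users, do this on a private copy of the lib
directory instead and point -L at it, and likewise for the fce lib
directory used by ifort):

  cp -r /appserv/intel/cce/10.1.021/lib $HOME/intel-lib-static
  cd $HOME/intel-lib-static
  for f in *.so; do mv "$f" "$f.disabled"; done

  # then relink with the copy in front of the search path, e.g.
  ./configure ... LDFLAGS="-L$HOME/intel-lib-static -i-static" ...

so the linker should pick up the static libimf.a etc. and the resulting
binaries no longer depend on libimf.so.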


-- Reuti




On Nov 10, 2009, at 2:51 PM, Blosch, Edwin L wrote:

I’m trying to build OpenMPI with Intel compilers, both static and  
dynamic libs, then move it to a system that does not have Intel  
compilers.  I don’t care about system libraries or OpenMPI  
loadable modules being dynamic, that’s all fine.  But I need the  
compiler libs to be statically linked into any executable.


I don’t seem to be smart enough to figure out how to get the Intel  
libs statically linked into the “orterun” command.


Can someone help suggest the right way to achieve this?

Here’s my configure command and the relevant output from the  
“make” inside tools/orterun.  Notice that I am passing –i-static  
in LDFLAGS, and it does indeed appear to have made it into the  
link line for orterun.  It just didn’t have the desired effect.  A  
subsequent ‘ldd’ shows that there is still a dependency on the  
libimf.so.


Thanks

./configure
--prefix=/release/cfd/openmpi-intel
--enable-mpirun-prefix-by-default
--enable-contrib-no-build=vt
--disable-per-user-config-files
--enable-mca-no-build=maffinity
--enable-static
--without-openib
--without-tm
--with-mpi-f90-size=small
CXX=/appserv/intel/cce/10.1.021/bin/icpc
CC=/appserv/intel/cce/10.1.021/bin/icc
'CFLAGS=  -O2'
'CXXFLAGS=  -O2'
F77=/appserv/intel/fce/10.1.021/bin/ifort
'FFLAGS=-D_GNU_SOURCE -fpe0 -no-ftz -traceback  -O2'
FC=/appserv/intel/fce/10.1.021/bin/ifort
'FCFLAGS=-D_GNU_SOURCE -fpe0 -no-ftz -traceback  -O2'
'LDFLAGS= -i-static'



make[2]: Entering directory `/home/bloscel/builds/openmpi/orte/ 
tools/orterun'

depbase=`echo main.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
/appserv/intel/cce/10.1.021/bin/icc -DHAVE_CONFIG_H -I. -I../../../ 
opal/include -I../../../orte/include -I../../../ompi/include - 
I../../../opal/mca/paffinity/linux/plpa/src/libplpa   - 
I../../..-DNDEBUG -finline-functions -fno-strict-aliasing - 
restrict -pthread -fvisibility=hidden -g -MT main.o -MD -MP -MF  
$depbase.Tpo -c -o main.o main.c &&\

mv -f $depbase.Tpo $depbase.Po
depbase=`echo orterun.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
/appserv/intel/cce/10.1.021/bin/icc -DHAVE_CONFIG_H -I. -I../../../ 
opal/include -I../../../orte/include -I../../../ompi/include - 
I../../../opal/mca/paffinity/linux/plpa/src/libplpa   - 
I../../..-DNDEBUG -finline-functions -fno-strict-aliasing - 
restrict -pthread -fvisibility=hidden -g -MT orterun.o -MD -MP -MF  
$depbase.Tpo -c -o orterun.o orterun.c &&\

mv -f $depbase.Tpo $depbase.Po
depbase=`echo debuggers.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
/appserv/intel/cce/10.1.021/bin/icc -DHAVE_CONFIG_H -I. -I../../../ 
opal/include -I../../../orte/include -I../../../ompi/include - 
I../../../opal/mca/paffinity/linux/plpa/src/libplpa   - 
I../../..-DNDEBUG -finline-functions -fno-strict-aliasing - 
restrict -pthread -fvisibility=hidden -g -MT debuggers.o -MD -MP - 
MF $depbase.Tpo -c -o debuggers.o debuggers.c &&\

mv -f $depbase.Tpo $depbase.Po
/bin/sh ../../../libtool --tag=CC   --mode=link /appserv/intel/cce/ 
10.1.021/bin/icc  -DNDEBUG -finline-functions -fno-strict-aliasing  
-restrict -pthread -fvisibility=hidden -g  -export-dynamic  -i- 
static  -o orterun main.o orterun.o debuggers.o ../../../orte/ 
libopen-rte.la -lnsl -lutil
libtool: link: /appserv/intel/cce/10.1.021/bin/icc -DNDEBUG - 
finline-functions -fno-strict-aliasing -restrict -pthread - 
fvisibility=hidden -g -i-static -o .libs/orterun main.o orterun.o  
debuggers.o -Wl,--export-dynamic  ../../../orte/.libs/libopen- 
rte.so /home/bloscel/builds/openmpi/opal/.libs/libopen-pal.so -ldl  
-lnsl -lutil -pthread -Wl,-rpath -Wl,/release/cfd/openmpi-intel/lib
/appserv/intel/cce/10.1.021/lib/libimf.so: warning: warning:  
feupdateenv is not implemented and will always fail




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
jsquy...@cisco.com


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users






Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry

Hi Reuti,

I followed your advice and switched to a local "tmpdir" instead of a 
shared one. This solved the session directory issue, thanks for your help!
However, I cannot understand how the issue disappeared. Any input would 
be welcome, as I would really like to understand how SGE/OpenMPI could fail 
when using such a configuration (i.e. with a shared "tmpdir").


Eloi


On 10/11/2009 19:17, Eloi Gaudry wrote:

Reuti,

The ACLs here were just added when I tried to force the /opt/sge/tmp 
subdirectories to be 777 (which I did when I first encountered the 
error of subdirectory creation within OpenMPI). I don't think the 
info I'll provide will be meaningful here:


moe:~# getfacl /opt/sge/tmp
getfacl: Removing leading '/' from absolute path names
# file: opt/sge/tmp
# owner: sgeadmin
# group: fft
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:group:fft:rwx
default:mask::rwx
default:other::rwx

I'll try to use a local directory instead of a shared one for 
"tmpdir". But as this issue seems somehow related to permissions, I 
don't know if this would eventually be the right solution.


Thanks for your help,
Eloi

Reuti wrote:

Hi,

Am 10.11.2009 um 19:01 schrieb Eloi Gaudry:


Reuti,

I'm using "tmpdir" as a shared directory that contains the session 
directories created during job submission, not for computing or 
local storage. Doesn't the session directory (i.e. 
job_id.queue_name) need to be shared among all computing nodes (at 
least the ones that would be used with orted during the parallel 
computation) ?


no. orted runs happily with local $TMPDIR on each and every node. The 
$TMPDIRs are intended to be used by the user for any temporary data 
for his job, as they are created and removed by SGE automatically for 
every job for his convenience.



All sequential job run fine, as no write operation is performed in 
"tmpdir/session_directory".


All users are known on the computing nodes and the master node (with 
use ldap authentication on all nodes).


As for the access checkings:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp


Aha, the + tells that there are some ACLs set:

getfacl /opt/sge/tmp



And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_nameround_robin
slots  32
user_lists NONE
xuser_listsNONE
start_proc_args/bin/true
stop_proc_args /bin/true
allocation_rule$round_robin
control_slaves TRUE
job_is_first_task  FALSE
urgency_slots  min
accounting_summary FALSE


Okay, fine.

-- Reuti



Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using a nfs-shared directory (/opt/sge/tmp), exported from the 
master node to all others computing nodes.


It's higly advisable to have the "tmpdir" local on each node. When 
you use "cd $TMPDIR" in your jobscript, all is done local on a node 
(when your application will just create the scratch file in your 
current working directory) which will speed up the computation and 
decrease the network traffic. Computing in as shared /opt/sge/tmp 
is like computing in each user's home directory.


To avoid that any user can remove someone else's files, the "t" 
flag is set like for /tmp: drwxrwxrwt 14 root root 4096 2009-11-10 
18:35 /tmp/


Nevertheless:

 with for /etc/export on server (named moe.fft):   /opt/sge
192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
   /etc/fstab on 
client:
moe.fft:/opt/sge
/opt/sgenfs 
rw,bg,soft,timeo=14, 0 0
Actually, the /opt/sge/tmp directory is 777 across all machines, 
thus all user should be able to create a directory inside.


All access checkings will be applied:

- on the server: what is "ls -d /opt/sge/tmp" showing?
- the one from the export (this seems to be fine)
- the one on the node (i.e., how it's mounted: cat /etc/fstab)

The issue seems somehow related to the session directory created 
inside /opt/sge/tmp, let's stay /opt/sge/tmp/29.1.smp8.q for 
example for the job 29 on queue smp8.q. This subdirectory of 
/opt/sge/tmp is created with nobody:nogroup drwxr-xr-x 
permissions... which in turn forbids


Did you try to run some simple jobs before the parallel ones - are 
these working? The daemons (qmaster and execd) were started as root?


The user is known on the file server, i.e. the machine hosting 
/opt/sge?


OpenMPI to create its subtree inside (as OpenMPI won't use 
nobody:nogroup credentials).


In SGE the master process (the one running the job script) will 
create the /opt/sge/tmp/29.1.smp8.q  and also each started qrsh 
inside SGE - all with the same name. What is your definition of the 
PE in SGE which you use?


-- Reuti


Ad Ralph suggested, I checked the SGE configuration, but I haven't 
found anything related to nobody:nogroup configuration so far.

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti

Hi Eloi,

Am 10.11.2009 um 23:42 schrieb Eloi Gaudry:

I followed your advice and switched to a local "tmpdir" instead of  
a share one. This solved the session directory issue, thanks for  
your help !


what user/group is no listed for the generated temporary directories  
(i.e. $TMPDIR)?


-- Reuti

However, I cannot understand how the issue disappeared. Any input  
would be welcome as I really like to understand how SGE/OpenMPI  
could failed when using such a configuration (i.e. with a shared  
"tmpdir").


Eloi


On 10/11/2009 19:17, Eloi Gaudry wrote:

Reuti,

The acl here were just added when I tried to force the /opt/sge/ 
tmp subdirectories to be 777 (which I did when I first encountered  
the error of subdirectories creation within OpenMPI). I don't  
think the info I'll provide will be meaningfull here:


moe:~# getfacl /opt/sge/tmp
getfacl: Removing leading '/' from absolute path names
# file: opt/sge/tmp
# owner: sgeadmin
# group: fft
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:group:fft:rwx
default:mask::rwx
default:other::rwx

I'll try to use a local directory instead of a shared one for  
"tmpdir". But as this issue seems somehow related to permissions,  
I don't know if this would eventually be the rigth solution.


Thanks for your help,
Eloi

Reuti wrote:

Hi,

Am 10.11.2009 um 19:01 schrieb Eloi Gaudry:


Reuti,

I'm using "tmpdir" as a shared directory that contains the  
session directories created during job submission, not for  
computing or local storage. Doesn't the session directory (i.e.  
job_id.queue_name) need to be shared among all computing nodes  
(at least the ones that would be used with orted during the  
parallel computation) ?


no. orted runs happily with local $TMPDIR on each and every node.  
The $TMPDIRs are intended to be used by the user for any  
temporary data for his job, as they are created and removed by  
SGE automatically for every job for his convenience.



All sequential job run fine, as no write operation is performed  
in "tmpdir/session_directory".


All users are known on the computing nodes and the master node  
(with use ldap authentication on all nodes).


As for the access checkings:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp


Aha, the + tells that there are some ACLs set:

getfacl /opt/sge/tmp



And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_nameround_robin
slots  32
user_lists NONE
xuser_listsNONE
start_proc_args/bin/true
stop_proc_args /bin/true
allocation_rule$round_robin
control_slaves TRUE
job_is_first_task  FALSE
urgency_slots  min
accounting_summary FALSE


Okay, fine.

-- Reuti



Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using a nfs-shared directory (/opt/sge/tmp), exported from  
the master node to all others computing nodes.


It's higly advisable to have the "tmpdir" local on each node.  
When you use "cd $TMPDIR" in your jobscript, all is done local  
on a node (when your application will just create the scratch  
file in your current working directory) which will speed up the  
computation and decrease the network traffic. Computing in as  
shared /opt/sge/tmp is like computing in each user's home  
directory.


To avoid that any user can remove someone else's files, the "t"  
flag is set like for /tmp: drwxrwxrwt 14 root root 4096  
2009-11-10 18:35 /tmp/


Nevertheless:

 with for /etc/export on server (named moe.fft):   /opt/sge 
192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
   /etc/fstab on  
client:moe.fft:/opt/ 
sge/opt/ 
sgenfs  
rw,bg,soft,timeo=14, 0 0
Actually, the /opt/sge/tmp directory is 777 across all  
machines, thus all user should be able to create a directory  
inside.


All access checkings will be applied:

- on the server: what is "ls -d /opt/sge/tmp" showing?
- the one from the export (this seems to be fine)
- the one on the node (i.e., how it's mounted: cat /etc/fstab)

The issue seems somehow related to the session directory  
created inside /opt/sge/tmp, let's stay /opt/sge/tmp/ 
29.1.smp8.q for example for the job 29 on queue smp8.q. This  
subdirectory of /opt/sge/tmp is created with nobody:nogroup  
drwxr-xr-x permissions... which in turn forbids


Did you try to run some simple jobs before the parallel ones -  
are these working? The daemons (qmaster and execd) were started  
as root?


The user is known on the file server, i.e. the machine hosting / 
opt/sge?


OpenMPI to create its subtree inside (as OpenMPI won't use  
nobody:nogroup credentials).


In SGE the master process (the one running the job script) will  
create the /opt/sge/tmp/29.1.smp8.q  and also each started qrsh  
inside SGE - all with the same name. What 

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti

Am 10.11.2009 um 23:51 schrieb Reuti:


Hi Eloi,

Am 10.11.2009 um 23:42 schrieb Eloi Gaudry:

I followed your advice and switched to a local "tmpdir" instead of  
a share one. This solved the session directory issue, thanks for  
your help !


what user/group is no listed for the generated temporary  
directories (i.e. $TMPDIR)?


...is  now listed ...


-- Reuti

However, I cannot understand how the issue disappeared. Any input  
would be welcome as I really like to understand how SGE/OpenMPI  
could failed when using such a configuration (i.e. with a shared  
"tmpdir").


Eloi


On 10/11/2009 19:17, Eloi Gaudry wrote:

Reuti,

The acl here were just added when I tried to force the /opt/sge/ 
tmp subdirectories to be 777 (which I did when I first  
encountered the error of subdirectories creation within OpenMPI).  
I don't think the info I'll provide will be meaningfull here:


moe:~# getfacl /opt/sge/tmp
getfacl: Removing leading '/' from absolute path names
# file: opt/sge/tmp
# owner: sgeadmin
# group: fft
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:group:fft:rwx
default:mask::rwx
default:other::rwx

I'll try to use a local directory instead of a shared one for  
"tmpdir". But as this issue seems somehow related to permissions,  
I don't know if this would eventually be the rigth solution.


Thanks for your help,
Eloi

Reuti wrote:

Hi,

Am 10.11.2009 um 19:01 schrieb Eloi Gaudry:


Reuti,

I'm using "tmpdir" as a shared directory that contains the  
session directories created during job submission, not for  
computing or local storage. Doesn't the session directory (i.e.  
job_id.queue_name) need to be shared among all computing nodes  
(at least the ones that would be used with orted during the  
parallel computation) ?


no. orted runs happily with local $TMPDIR on each and every  
node. The $TMPDIRs are intended to be used by the user for any  
temporary data for his job, as they are created and removed by  
SGE automatically for every job for his convenience.



All sequential job run fine, as no write operation is performed  
in "tmpdir/session_directory".


All users are known on the computing nodes and the master node  
(with use ldap authentication on all nodes).


As for the access checkings:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp


Aha, the + tells that there are some ACLs set:

getfacl /opt/sge/tmp



And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_nameround_robin
slots  32
user_lists NONE
xuser_listsNONE
start_proc_args/bin/true
stop_proc_args /bin/true
allocation_rule$round_robin
control_slaves TRUE
job_is_first_task  FALSE
urgency_slots  min
accounting_summary FALSE


Okay, fine.

-- Reuti



Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using a nfs-shared directory (/opt/sge/tmp), exported  
from the master node to all others computing nodes.


It's higly advisable to have the "tmpdir" local on each node.  
When you use "cd $TMPDIR" in your jobscript, all is done local  
on a node (when your application will just create the scratch  
file in your current working directory) which will speed up  
the computation and decrease the network traffic. Computing in  
as shared /opt/sge/tmp is like computing in each user's home  
directory.


To avoid that any user can remove someone else's files, the  
"t" flag is set like for /tmp: drwxrwxrwt 14 root root 4096  
2009-11-10 18:35 /tmp/


Nevertheless:

 with for /etc/export on server (named moe.fft):   /opt/ 
sge192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
   /etc/fstab on  
client:moe.fft:/opt/ 
sge/opt/ 
sgenfs  
rw,bg,soft,timeo=14, 0 0
Actually, the /opt/sge/tmp directory is 777 across all  
machines, thus all user should be able to create a directory  
inside.


All access checkings will be applied:

- on the server: what is "ls -d /opt/sge/tmp" showing?
- the one from the export (this seems to be fine)
- the one on the node (i.e., how it's mounted: cat /etc/fstab)

The issue seems somehow related to the session directory  
created inside /opt/sge/tmp, let's stay /opt/sge/tmp/ 
29.1.smp8.q for example for the job 29 on queue smp8.q. This  
subdirectory of /opt/sge/tmp is created with nobody:nogroup  
drwxr-xr-x permissions... which in turn forbids


Did you try to run some simple jobs before the parallel ones -  
are these working? The daemons (qmaster and execd) were  
started as root?


The user is known on the file server, i.e. the machine  
hosting /opt/sge?


OpenMPI to create its subtree inside (as OpenMPI won't use  
nobody:nogroup credentials).


In SGE the master process (the one running the job script)  
will create the /opt/sge/tmp/29.1.smp8.q  and als

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti

Am 11.11.2009 um 00:03 schrieb Eloi Gaudry:

The user/group used to generate the temporary directories was  
nobody/nogroup, when using a shared $tmpdir.
Now that I'm using a local $tmpdir (one for each node, not  
distributed over nfs), the right credentials (i.e. my username/ 
groupname) are used to create the session directory inside $tmpdir,  
which in turn allows OpenMPI to successfully create its session  
subdirectories.


Aha, this explains why it's working now - so it's not an SGE issue IMHO.

Question: when a user on the execution node goes to /opt/sge/tmp and  
creates a directory on the command line with mkdir: what group/user  
is used then?


-- Reuti



Eloi


On 10/11/2009 23:51, Reuti wrote:

Hi Eloi,

Am 10.11.2009 um 23:42 schrieb Eloi Gaudry:


I followed your advice and switched to a local "tmpdir" instead of a
share one. This solved the session directory issue, thanks for your
help !


what user/group is no listed for the generated temporary directories
(i.e. $TMPDIR)?

-- Reuti


However, I cannot understand how the issue disappeared. Any input
would be welcome as I really like to understand how SGE/OpenMPI  
could
failed when using such a configuration (i.e. with a shared  
"tmpdir").


Eloi


On 10/11/2009 19:17, Eloi Gaudry wrote:

Reuti,

The acl here were just added when I tried to force the /opt/sge/tmp
subdirectories to be 777 (which I did when I first encountered the
error of subdirectories creation within OpenMPI). I don't think the
info I'll provide will be meaningfull here:

moe:~# getfacl /opt/sge/tmp
getfacl: Removing leading '/' from absolute path names
# file: opt/sge/tmp
# owner: sgeadmin
# group: fft
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:group:fft:rwx
default:mask::rwx
default:other::rwx

I'll try to use a local directory instead of a shared one for
"tmpdir". But as this issue seems somehow related to permissions, I
don't know if this would eventually be the rigth solution.

Thanks for your help,
Eloi

Reuti wrote:

Hi,

Am 10.11.2009 um 19:01 schrieb Eloi Gaudry:


Reuti,

I'm using "tmpdir" as a shared directory that contains the  
session

directories created during job submission, not for computing or
local storage. Doesn't the session directory (i.e.
job_id.queue_name) need to be shared among all computing nodes  
(at

least the ones that would be used with orted during the parallel
computation) ?


no. orted runs happily with local $TMPDIR on each and every node.
The $TMPDIRs are intended to be used by the user for any temporary
data for his job, as they are created and removed by SGE
automatically for every job for his convenience.


All sequential job run fine, as no write operation is  
performed in

"tmpdir/session_directory".

All users are known on the computing nodes and the master node
(with use ldap authentication on all nodes).

As for the access checkings:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp


Aha, the + tells that there are some ACLs set:

getfacl /opt/sge/tmp



And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_nameround_robin
slots  32
user_lists NONE
xuser_listsNONE
start_proc_args/bin/true
stop_proc_args /bin/true
allocation_rule$round_robin
control_slaves TRUE
job_is_first_task  FALSE
urgency_slots  min
accounting_summary FALSE


Okay, fine.

-- Reuti



Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using a nfs-shared directory (/opt/sge/tmp), exported from
the master node to all others computing nodes.


It's higly advisable to have the "tmpdir" local on each node.
When you use "cd $TMPDIR" in your jobscript, all is done  
local on

a node (when your application will just create the scratch file
in your current working directory) which will speed up the
computation and decrease the network traffic. Computing in as
shared /opt/sge/tmp is like computing in each user's home  
directory.


To avoid that any user can remove someone else's files, the "t"
flag is set like for /tmp: drwxrwxrwt 14 root root 4096
2009-11-10 18:35 /tmp/

Nevertheless:


 with for /etc/export on server (named moe.fft):   /opt/sge
192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
   /etc/fstab on
client:
moe.fft:/opt/sge
/opt/sgenfs
rw,bg,soft,timeo=14, 0 0
Actually, the /opt/sge/tmp directory is 777 across all  
machines,

thus all user should be able to create a directory inside.


All access checkings will be applied:

- on the server: what is "ls -d /opt/sge/tmp" showing?
- the one from the export (this seems to be fine)
- the one on the node (i.e., how it's mounted: cat /etc/fstab)

The issue seems somehow related to the session directory  
created

inside /opt/sge/tmp, let's stay /opt/sge/tmp/29.1.smp8.q for
example for the job 29 on queue smp8.q. This sub

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
On any execution node, creating a subdirectory of /opt/sge/tmp (i.e. 
creating a session directory inside $TMPDIR) results in a new directory 
owned by the user/group that submitted the job (not nobody/nogroup).
If I switch back to a shared /opt/sge/tmp directory, all session 
directories created by SGE get nobody/nogroup as owner.


Eloi

On 11/11/2009 00:14, Reuti wrote:

Am 11.11.2009 um 00:03 schrieb Eloi Gaudry:

The user/group used to generate the temporary directories was 
nobody/nogroup, when using a shared $tmpdir.
Now that I'm using a local $tmpdir (one for each node, not 
distributed over nfs), the right credentials (i.e. my 
username/groupname) are used to create the session directory inside 
$tmpdir, which in turn allows OpenMPI to successfully create its 
session subdirectories.


Aha, this explains why it's working now - so it's not an SGE issue IMHO.

Question: when a user on the execution node goes to /opt/sge/tmp and 
creates a directory on the command line with mkdir: what group/user is 
used then?


-- Reuti



Eloi


On 10/11/2009 23:51, Reuti wrote:

Hi Eloi,

Am 10.11.2009 um 23:42 schrieb Eloi Gaudry:


I followed your advice and switched to a local "tmpdir" instead of a
share one. This solved the session directory issue, thanks for your
help !


what user/group is no listed for the generated temporary directories
(i.e. $TMPDIR)?

-- Reuti


However, I cannot understand how the issue disappeared. Any input
would be welcome as I really like to understand how SGE/OpenMPI could
failed when using such a configuration (i.e. with a shared "tmpdir").

Eloi


On 10/11/2009 19:17, Eloi Gaudry wrote:

Reuti,

The acl here were just added when I tried to force the /opt/sge/tmp
subdirectories to be 777 (which I did when I first encountered the
error of subdirectories creation within OpenMPI). I don't think the
info I'll provide will be meaningfull here:

moe:~# getfacl /opt/sge/tmp
getfacl: Removing leading '/' from absolute path names
# file: opt/sge/tmp
# owner: sgeadmin
# group: fft
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:group:fft:rwx
default:mask::rwx
default:other::rwx

I'll try to use a local directory instead of a shared one for
"tmpdir". But as this issue seems somehow related to permissions, I
don't know if this would eventually be the rigth solution.

Thanks for your help,
Eloi

Reuti wrote:

Hi,

Am 10.11.2009 um 19:01 schrieb Eloi Gaudry:


Reuti,

I'm using "tmpdir" as a shared directory that contains the session
directories created during job submission, not for computing or
local storage. Doesn't the session directory (i.e.
job_id.queue_name) need to be shared among all computing nodes (at
least the ones that would be used with orted during the parallel
computation) ?


no. orted runs happily with local $TMPDIR on each and every node.
The $TMPDIRs are intended to be used by the user for any temporary
data for his job, as they are created and removed by SGE
automatically for every job for his convenience.



All sequential job run fine, as no write operation is performed in
"tmpdir/session_directory".

All users are known on the computing nodes and the master node
(with use ldap authentication on all nodes).

As for the access checkings:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp


Aha, the + tells that there are some ACLs set:

getfacl /opt/sge/tmp



And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_nameround_robin
slots  32
user_lists NONE
xuser_listsNONE
start_proc_args/bin/true
stop_proc_args /bin/true
allocation_rule$round_robin
control_slaves TRUE
job_is_first_task  FALSE
urgency_slots  min
accounting_summary FALSE


Okay, fine.

-- Reuti



Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using a nfs-shared directory (/opt/sge/tmp), exported from
the master node to all others computing nodes.


It's higly advisable to have the "tmpdir" local on each node.
When you use "cd $TMPDIR" in your jobscript, all is done local on
a node (when your application will just create the scratch file
in your current working directory) which will speed up the
computation and decrease the network traffic. Computing in as
shared /opt/sge/tmp is like computing in each user's home 
directory.


To avoid that any user can remove someone else's files, the "t"
flag is set like for /tmp: drwxrwxrwt 14 root root 4096
2009-11-10 18:35 /tmp/

Nevertheless:


 with for /etc/export on server (named moe.fft):   /opt/sge
192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
   /etc/fstab on
client:
moe.fft:/opt/sge
/opt/sgenfs
rw,bg,soft,timeo=14, 0 0
Actually, the /opt/sge/tmp directory is 777 across all machines,
thus all user should be able to create a directory inside.


All access checkings will be applied:

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti

To avoid misunderstandings:

Am 11.11.2009 um 00:19 schrieb Eloi Gaudry:

On any execution node, creating a subdirectory of /opt/sge/tmp  
(i.e. creating a session directory inside $TMPDIR) results in a new  
directory own by the user/group that submitted the job (not nobody/ 
nogroup).


$TMPDIR is in this case /opt/sge/tmp/..

I really meant to create a directory in /opt/sge/tmp by hand with  
mkdir, but on the execution node which mounts /opt/sge.


-- Reuti


If I switch back to a shared /opt/sge/tmp directory, all session  
directories created by sge got nobody/nogroup as owner.


Eloi

On 11/11/2009 00:14, Reuti wrote:

Am 11.11.2009 um 00:03 schrieb Eloi Gaudry:

The user/group used to generate the temporary directories was  
nobody/nogroup, when using a shared $tmpdir.
Now that I'm using a local $tmpdir (one for each node, not  
distributed over nfs), the right credentials (i.e. my username/ 
groupname) are used to create the session directory inside  
$tmpdir, which in turn allows OpenMPI to successfully create its  
session subdirectories.


Aha, this explains why it's working now - so it's not an SGE issue  
IMHO.


Question: when a user on the execution node goes to /opt/sge/tmp  
and creates a directory on the command line with mkdir: what group/ 
user is used then?


-- Reuti



Eloi


On 10/11/2009 23:51, Reuti wrote:

Hi Eloi,

Am 10.11.2009 um 23:42 schrieb Eloi Gaudry:

I followed your advice and switched to a local "tmpdir" instead  
of a
share one. This solved the session directory issue, thanks for  
your

help !


what user/group is no listed for the generated temporary  
directories

(i.e. $TMPDIR)?

-- Reuti


However, I cannot understand how the issue disappeared. Any input
would be welcome as I really like to understand how SGE/OpenMPI  
could
failed when using such a configuration (i.e. with a shared  
"tmpdir").


Eloi


On 10/11/2009 19:17, Eloi Gaudry wrote:

Reuti,

The acl here were just added when I tried to force the /opt/ 
sge/tmp
subdirectories to be 777 (which I did when I first encountered  
the
error of subdirectories creation within OpenMPI). I don't  
think the

info I'll provide will be meaningfull here:

moe:~# getfacl /opt/sge/tmp
getfacl: Removing leading '/' from absolute path names
# file: opt/sge/tmp
# owner: sgeadmin
# group: fft
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:group:fft:rwx
default:mask::rwx
default:other::rwx

I'll try to use a local directory instead of a shared one for
"tmpdir". But as this issue seems somehow related to permissions, I
don't know if this would eventually be the right solution.

Thanks for your help,
Eloi

Reuti wrote:

Hi,

Am 10.11.2009 um 19:01 schrieb Eloi Gaudry:


Reuti,

I'm using "tmpdir" as a shared directory that contains the  
session

directories created during job submission, not for computing or
local storage. Doesn't the session directory (i.e.
job_id.queue_name) need to be shared among all computing  
nodes (at
least the ones that would be used with orted during the  
parallel

computation) ?


no. orted runs happily with local $TMPDIR on each and every  
node.
The $TMPDIRs are intended to be used by the user for any  
temporary

data for his job, as they are created and removed by SGE
automatically for every job for his convenience.


All sequential jobs run fine, as no write operation is performed in
"tmpdir/session_directory".

All users are known on the computing nodes and the master node
(we use LDAP authentication on all nodes).

As for the access checks:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp


Aha, the + tells that there are some ACLs set:

getfacl /opt/sge/tmp



And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_name            round_robin
slots              32
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE


Okay, fine.

-- Reuti



Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using an NFS-shared directory (/opt/sge/tmp), exported from
the master node to all other computing nodes.


It's highly advisable to have the "tmpdir" local on each node.
When you use "cd $TMPDIR" in your jobscript, everything is done locally
on the node (assuming your application just creates its scratch files
in the current working directory), which speeds up the computation and
reduces network traffic. Computing in a shared /opt/sge/tmp is like
computing in each user's home directory.


To avoid that any user can remove someone else's files, the  
"t"

flag is set like for /tmp: drwxrwxrwt 14 root root 4096
2009-11-10 18:35 /tmp/

Nevertheless:


 with /etc/exports on the server (named moe.fft):
   /opt/sge   192.168.0.0/255.255.

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
This is what I did (created /opt/sge/tmp/test by hand on an execution
host, logged in as a regular cluster user).


Eloi

On 11/11/2009 00:26, Reuti wrote:

To avoid misunderstandings:

Am 11.11.2009 um 00:19 schrieb Eloi Gaudry:

On any execution node, creating a subdirectory of /opt/sge/tmp (i.e. 
creating a session directory inside $TMPDIR) results in a new 
directory owned by the user/group that submitted the job (not 
nobody/nogroup).


$TMPDIR is in this case /opt/sge/tmp/..

I really meant to create a directory in /opt/sge/tmp by hand with 
mkdir, but on the execution node which mounts /opt/sge.


-- Reuti


If I switch back to a shared /opt/sge/tmp directory, all session 
directories created by sge got nobody/nogroup as owner.


Eloi

On 11/11/2009 00:14, Reuti wrote:

Am 11.11.2009 um 00:03 schrieb Eloi Gaudry:

The user/group used to generate the temporary directories was 
nobody/nogroup, when using a shared $tmpdir.
Now that I'm using a local $tmpdir (one for each node, not 
distributed over nfs), the right credentials (i.e. my 
username/groupname) are used to create the session directory inside 
$tmpdir, which in turn allows OpenMPI to successfully create its 
session subdirectories.


Aha, this explains why it's working now - so it's not an SGE issue 
IMHO.


Question: when a user on the execution node goes to /opt/sge/tmp and 
creates a directory on the command line with mkdir: what group/user 
is used then?


-- Reuti



Eloi


On 10/11/2009 23:51, Reuti wrote:

Hi Eloi,

Am 10.11.2009 um 23:42 schrieb Eloi Gaudry:


I followed your advice and switched to a local "tmpdir" instead of a
shared one. This solved the session directory issue, thanks for your
help!


what user/group is now listed for the generated temporary directories
(i.e. $TMPDIR)?

-- Reuti


However, I cannot understand how the issue disappeared. Any input
would be welcome, as I would really like to understand how SGE/OpenMPI
could fail when using such a configuration (i.e. with a shared
"tmpdir").


Eloi


On 10/11/2009 19:17, Eloi Gaudry wrote:

Reuti,

The ACLs here were just added when I tried to force the /opt/sge/tmp
subdirectories to be 777 (which I did when I first encountered the
error of subdirectory creation within OpenMPI). I don't think the
info I'll provide will be meaningful here:

moe:~# getfacl /opt/sge/tmp
getfacl: Removing leading '/' from absolute path names
# file: opt/sge/tmp
# owner: sgeadmin
# group: fft
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:group:fft:rwx
default:mask::rwx
default:other::rwx

I'll try to use a local directory instead of a shared one for
"tmpdir". But as this issue seems somehow related to permissions, I
don't know if this would eventually be the right solution.

Thanks for your help,
Eloi

Reuti wrote:

Hi,

Am 10.11.2009 um 19:01 schrieb Eloi Gaudry:


Reuti,

I'm using "tmpdir" as a shared directory that contains the 
session

directories created during job submission, not for computing or
local storage. Doesn't the session directory (i.e.
job_id.queue_name) need to be shared among all computing nodes 
(at

least the ones that would be used with orted during the parallel
computation) ?


no. orted runs happily with local $TMPDIR on each and every node.
The $TMPDIRs are intended to be used by the user for any temporary
data for his job, as they are created and removed by SGE
automatically for every job for his convenience.


All sequential jobs run fine, as no write operation is 
performed in

"tmpdir/session_directory".

All users are known on the computing nodes and the master node
(we use LDAP authentication on all nodes).

As for the access checks:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp


Aha, the + tells that there are some ACLs set:

getfacl /opt/sge/tmp



And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_name            round_robin
slots              32
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE


Okay, fine.

-- Reuti



Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using an NFS-shared directory (/opt/sge/tmp), exported from
the master node to all other computing nodes.


It's highly advisable to have the "tmpdir" local on each node.
When you use "cd $TMPDIR" in your jobscript, everything is done locally
on the node (assuming your application just creates its scratch files
in the current working directory), which speeds up the computation and
reduces network traffic. Computing in a shared /opt/sge/tmp is like
computing in each user's home directory.


To avoid that any user can remove someone else's files, the "t"
flag is set like for /tmp: drwxrwxrwt 14 root root 4096
2009-11-10 18:35 /tmp/

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti

Am 11.11.2009 um 00:29 schrieb Eloi Gaudry:

This is what I did (created /opt/sge/tmp/test by hand on an
execution host, logged in as a regular cluster user).


Then we end up where I started to think first, but I missed the
implied default: can you export /opt/sge with "no_root_squash" and
reload the NFS server? SGE will create the $TMPDIR as root/root and
then change the uid/gid - both fail when root is squashed to
nobody/nogroup.

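For illustration, a sketch of the adjusted export line (reusing the
subnet and options quoted earlier in this thread; on Linux, root_squash
is the default, so it has to be disabled explicitly):

/etc/exports on moe.fft:
  /opt/sge   192.168.0.0/255.255.255.0(rw,sync,no_subtree_check,no_root_squash)

followed by "exportfs -ra" (or a restart of the NFS server) to apply it.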

But having $TMPDIR local is still an advantage. Even SGE's spool
directories can be local: http://gridengine.sunsource.net/howto/nfsreduce.html


-- Reuti



Eloi

On 11/11/2009 00:26, Reuti wrote:

To avoid misunderstandings:

Am 11.11.2009 um 00:19 schrieb Eloi Gaudry:

On any execution node, creating a subdirectory of /opt/sge/tmp  
(i.e. creating a session directory inside $TMPDIR) results in a  
new directory owned by the user/group that submitted the job (not  
nobody/nogroup).


$TMPDIR is in this case /opt/sge/tmp/..

I really meant to create a directory in /opt/sge/tmp by hand with  
mkdir, but on the execution node which mounts /opt/sge.


-- Reuti


If I switch back to a shared /opt/sge/tmp directory, all session  
directories created by sge got nobody/nogroup as owner.


Eloi

On 11/11/2009 00:14, Reuti wrote:

Am 11.11.2009 um 00:03 schrieb Eloi Gaudry:

The user/group used to generate the temporary directories was  
nobody/nogroup, when using a shared $tmpdir.
Now that I'm using a local $tmpdir (one for each node, not  
distributed over nfs), the right credentials (i.e. my username/ 
groupname) are used to create the session directory inside  
$tmpdir, which in turn allows OpenMPI to successfully create  
its session subdirectories.


Aha, this explains why it's working now - so it's not an SGE  
issue IMHO.


Question: when a user on the execution node goes to /opt/sge/tmp  
and creates a directory on the command line with mkdir: what  
group/user is used then?


-- Reuti



Eloi


On 10/11/2009 23:51, Reuti wrote:

Hi Eloi,

Am 10.11.2009 um 23:42 schrieb Eloi Gaudry:

I followed your advice and switched to a local "tmpdir"  
instead of a
share one. This solved the session directory issue, thanks  
for your

help !


what user/group is now listed for the generated temporary directories
(i.e. $TMPDIR)?

-- Reuti

However, I cannot understand how the issue disappeared. Any input
would be welcome, as I would really like to understand how SGE/OpenMPI
could fail when using such a configuration (i.e. with a shared
"tmpdir").


Eloi


On 10/11/2009 19:17, Eloi Gaudry wrote:

Reuti,

The ACLs here were just added when I tried to force the /opt/sge/tmp
subdirectories to be 777 (which I did when I first encountered the
error of subdirectory creation within OpenMPI). I don't think the
info I'll provide will be meaningful here:

moe:~# getfacl /opt/sge/tmp
getfacl: Removing leading '/' from absolute path names
# file: opt/sge/tmp
# owner: sgeadmin
# group: fft
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:group:fft:rwx
default:mask::rwx
default:other::rwx

I'll try to use a local directory instead of a shared one for
"tmpdir". But as this issue seems somehow related to permissions, I
don't know if this would eventually be the right solution.

Thanks for your help,
Eloi

Reuti wrote:

Hi,

Am 10.11.2009 um 19:01 schrieb Eloi Gaudry:


Reuti,

I'm using "tmpdir" as a shared directory that contains the  
session
directories created during job submission, not for  
computing or

local storage. Doesn't the session directory (i.e.
job_id.queue_name) need to be shared among all computing  
nodes (at
least the ones that would be used with orted during the  
parallel

computation) ?


no. orted runs happily with local $TMPDIR on each and every  
node.
The $TMPDIRs are intended to be used by the user for any  
temporary

data for his job, as they are created and removed by SGE
automatically for every job for his convenience.


All sequential jobs run fine, as no write operation is performed in
"tmpdir/session_directory".

All users are known on the computing nodes and the master node
(we use LDAP authentication on all nodes).

As for the access checks:
moe:~# ls -alrtd /opt/sge/tmp
drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp


Aha, the + tells that there are some ACLs set:

getfacl /opt/sge/tmp



And for the parallel environment configuration:
moe:~# qconf -sp round_robin
pe_name            round_robin
slots              32
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE


Okay, fine.

-- Reuti



Thanks for your help,
Eloi

Reuti wrote:

Am 10.11.2009 um 18:20 schrieb Eloi Gaudry:


Thanks for your help Reuti,

I'm using an NFS-shared directory (/opt/sge/tmp),  
exported from

the master node to all others comput

Re: [OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Tom Rosmond
Ralph,

I am using 1.3.2, so the relative node syntax certainly seems the way to
go.  However, I seem to be missing something.  On the 'orte_hosts' man
page near the top is the simple example:

 mpirun -pernode -host +n1,+n2 ./app1 : -host +n3,+n4 ./app2

I set up my job to run on 4 nodes (4 processors/node), and slavishly
copied this line into my PBS script.  However, I got the following error
message:

--
mpirun found multiple applications specified on the command line, with
at least one that failed to specify the number of processes to execute.
When specifying multiple applications, you must specify how many
processes of each to launch via the -np argument.
--


I suspect an '-npernode 4' option, rather than '-pernode', is what I
really need, since I want 4 processes per node.  Either way, however, I
don't think that explains the above error message.  Correct?  Do I still
need to extract node-name information from the PBS_NODEFILE for this
approach, and replace n1, n2, etc, with the actual node-names?

T. Rosmond




On Tue, 2009-11-10 at 14:54 -0700, Ralph Castain wrote:
> What version are you trying to do this with?
> 
> Reason I ask: in 1.3.x, we introduced relative node syntax for  
> specifying hosts to use. This would eliminate the need to create the  
> hostfiles.
> 
> You might do a "man orte_hosts" (assuming you installed the man pages)  
> and see what it says.
> 
> Ralph
> 
> On Nov 10, 2009, at 2:46 PM, Tom Rosmond wrote:
> 
> > I want to run a number of MPI executables simultaneously in a PBS job.
> > For example on my system I do 'cat $PBS_NODEFILE' and get a list like
> > this:
> >
> > n04
> > n04
> > n04
> > n04
> > n06
> > n06
> > n06
> > n06
> > n07
> > n07
> > n07
> > n07
> > n09
> > n09
> > n09
> > n09
> >
> > i.e., 16 processors on 4 nodes, from which I can parse into file(s) as
> > desired.  If I want to run prog1 on 1 node (4 processors), prog2 on 1
> > node (4 processors), and prog3 on 2 nodes (8 processors), I think the
> > syntax will be something like:
> >
> >  mpirun -np 4 --hostfile nodefile1 prog1: \
> > -np 4 --hostfile nodefile2 prog2: \
> > -np 8 --hostfile nodefile3 prog3
> >
> > Where nodefile1, nodefile2, and nodefile3 are the lists extracted from
> > PBS_NODEFILE.  Is this correct?  Any suggestion/advice, (e.g. syntax  
> > of
> > the 'nodefiles'), is appreciated.
> >
> > T. Rosmond
> >
> >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Ralph Castain
You can use the relative host syntax, but you cannot use a "pernode"  
or "npernode" option when you have more than one application on the  
cmd line. You have to specify the number of procs for each  
application, as the error message says. :-)


IIRC, the reason was that we couldn't decide on how to interpret the  
cmd line - though looking at this example, I think I could figure it  
out. Anyway, that is the problem.
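
For illustration, a sketch of what the command line could look like once
every application gets an explicit -np (untested; it uses the relative
node syntax and assumes the 4-slots-per-node allocation quoted below):

mpirun -np 4 -host +n1 ./prog1 : -np 4 -host +n2 ./prog2 : \
       -np 8 -host +n3,+n4 ./prog3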


HTH
Ralph

On Nov 10, 2009, at 5:48 PM, Tom Rosmond wrote:


Ralph,

I am using 1.3.2, so the relative node syntax certainly seems the  
way to

go.  However, I seem to be missing something.  On the 'orte_hosts' man
page near the top is the simple example:

mpirun -pernode -host +n1,+n2 ./app1 : -host +n3,+n4 ./app2

I set up my job to run on 4 nodes (4 processors/node), and slavishly
copied this line into my PBS script.  However, I got the following  
error

message:

--
mpirun found multiple applications specified on the command line, with
at least one that failed to specify the number of processes to  
execute.

When specifying multiple applications, you must specify how many
processes of each to launch via the -np argument.
--


I suspect an '-npernode 4' option, rather than '-pernode', is what I
really need, since I want 4 processes per node.  Either way,  
however, I
don't think that explains the above error message.  Correct?  Do I  
still

need to extract node-name information from the PBS_NODEFILE for this
approach, and replace n1, n2, etc, with the actual node-names?

T. Rosmond




On Tue, 2009-11-10 at 14:54 -0700, Ralph Castain wrote:

What version are you trying to do this with?

Reason I ask: in 1.3.x, we introduced relative node syntax for
specifying hosts to use. This would eliminate the need to create the
hostfiles.

You might do a "man orte_hosts" (assuming you installed the man  
pages)

and see what it says.

Ralph

On Nov 10, 2009, at 2:46 PM, Tom Rosmond wrote:

I want to run a number of MPI executables simultaneously in a PBS  
job.
For example on my system I do 'cat $PBS_NODEFILE' and get a list  
like

this:

n04
n04
n04
n04
n06
n06
n06
n06
n07
n07
n07
n07
n09
n09
n09
n09

i.e., 16 processors on 4 nodes, from which I can parse into file(s)  
as
desired.  If I want to run prog1 on 1 node (4 processors), prog2  
on 1
node (4 processors), and prog3 on 2 nodes (8 processors), I think  
the

syntax will be something like:

mpirun -np 4 --hostfile nodefile1 prog1: \
   -np 4 --hostfile nodefile2 prog2: \
   -np 8 --hostfile nodefile3 prog3

Where nodefile1, nodefile2, and nodefile3 are the lists extracted  
from

PBS_NODEFILE.  Is this correct?  Any suggestion/advice, (e.g. syntax
of
the 'nodefiles'), is appreciated.

T. Rosmond



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




[OMPI users] maximum value for count argument

2009-11-10 Thread Martin Siegert
Hi,

I have a problem with sending/receiving large buffers when using
openmpi (version 1.3.3), e.g.,

MPI_Allreduce(sbuf, rbuf, count, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

with count=18000 (this problem does not appear to be unique for
Allreduce, but occurs with Reduce and Bcast as well; maybe more).
Initially I thought the maximum value for count would be 2^31-1
because count is an int. However, when using MPICH2 I receive a
segfault already when count=2^31/8 thus I suspect that they transfer
bytes instead of doubles internally and the count for the # of bytes
wraps around at that value. This I can deal with (it is not nice,
but I can wrap all calls such that as soon as count > 268435456
several calls are made).
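
For reference, a sketch of such a wrapper in C (the helper name is made
up, and the chunk size is a conservative guess below the 2^31/8
threshold mentioned above; this works around the limit rather than
fixing it):

#include <mpi.h>

#define MAX_CHUNK (1L << 27)   /* 134217728 doubles, safely below 2^31/8 */

/* Sum-allreduce a large buffer of doubles in chunks small enough that
 * an implementation counting bytes in an int cannot wrap around. */
static int allreduce_sum_double(double *sbuf, double *rbuf, long count,
                                MPI_Comm comm)
{
    long done = 0;
    while (done < count) {
        long left = count - done;
        int n = (left > MAX_CHUNK) ? (int)MAX_CHUNK : (int)left;
        int rc = MPI_Allreduce(sbuf + done, rbuf + done, n,
                               MPI_DOUBLE, MPI_SUM, comm);
        if (rc != MPI_SUCCESS)
            return rc;
        done += n;
    }
    return MPI_SUCCESS;
}
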

However, with openmpi I just cannot figure out what the largest
permitted value is: in most cases the MPI calls hang for
count > 176763240, but this is not completely reproducible. This
appears to depend on the history, i.e., what other MPI routines
have been called before that.
From looking at the code, as far as I understand it, the MPICH2 problem
should not appear for openmpi: the allreduce call is split up into
several calls anyway - see the loop

for (phase = 0; phase < num_phases; phase ++) {
...
}

in coll_tuned_allreduce.c. In fact that loop is executed just fine.
The "hang" occurs when ompi_coll_tuned_sendrecv is called
(line 839 of coll_tuned_allreduce.c). Here is the call of that function:

(gdb) s
ompi_coll_tuned_sendrecv_actual (sendbuf=0x2aab2d539410, scount=9000, 
sdatatype=0x602530, dest=1, stag=-12, recvbuf=0x2aab02694010, 
rcount=9000, rdatatype=0x602530, source=1, rtag=-12, comm=0x602730, 
status=0x0) at coll_tuned_util.c:41

and the program just hangs as soon as ompi_request_wait_all (line 55 of
coll_tuned_util.c) is executed.

Any ideas how to fix this?

Cheers,
Martin

-- 
Martin Siegert
Head, Research Computing
WestGrid Site Lead
IT Services                phone: 778 782-4691
Simon Fraser University    fax:   778 782-4242
Burnaby, British Columbia  email: sieg...@sfu.ca
Canada  V5A 1S6