[OMPI users] Starting on remote nodes

2006-10-25 Thread Katherine Holcomb
We currently use MPICH on our Linux clusters, but due to a high
frequency of semaphore problems we are planning to replace it.  OpenMPI
looks like our best candidate but we have hit a snag.  We support
multiple compilers (specifically PGI and Intel) and due to
incompatibilities in different vendors' f90 .mod files, we have separate
directories for OpenMPI with each compiler.  Therefore we cannot set a
global path to the OpenMPI binaries -- it will differ depending on the
user's choice of compiler.  I have read about the --prefix flag and this
does work, but our users are mostly barely conversant with Unix and many
would have difficulty finding and specifying the appropriate path.  (We
use the modules software environment currently to set paths and the like
for them.)  Is there some way to specify something like "use the same
path as you are in" from the root process?  There was some allusion in
the FAQ to changing the wrappers to include directives, but the link led
to a "no such category" page.
-- 
Katherine Holcomb, Ph.D.                kholc...@virginia.edu
Research Computing Support Group - ITC  Office Phone: (434) 982-5948
148 BSEL, Clark Hall                    Center Phone: (434) 243-8799
University of Virginia 22904



Re: [OMPI users] Starting on remote nodes

2006-10-25 Thread Rainer Keller
Hello Katherine,
On Wednesday 25 October 2006 17:43, Katherine Holcomb wrote:
> We currently use MPICH on our Linux clusters, but due to a high
> frequency of semaphore problems we are planning to replace it.  OpenMPI
> looks like our best candidate but we have hit a snag.  We support
> multiple compilers (specifically PGI and Intel) and due to
> incompatibilities in different vendors' f90 .mod files, we have separate
> directories for OpenMPI with each compiler.  Therefore we cannot set a
> global path to the OpenMPI binaries -- it will differ depending on the
> user's choice of compiler.  I have read about the --prefix flag and this
> does work, but our users are mostly barely conversant with Unix and many
> would have difficulty finding and specifying the appropriate path.  (We
> use the modules software environment currently to set paths and the like
> for them.)
Since you use modules already, you can set the path/prefix according to
the compiler being used.  Additionally, you may provide wrapper scripts
for mpirun that specify the correct prefix, so that nothing changes for
the user, e.g.:

/opt/OpenMPI/bin/...           (wrapper scripts that call into)
/opt/OpenMPI/1.1.2-pgi/...
/opt/OpenMPI/1.1.2-intel/...
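A minimal sketch of such a wrapper, assuming the compiler choice is
exported by the user's module (the MPI_COMPILER variable and the version
paths here are illustrative, not from the actual setup):

```shell
#!/bin/sh
# Hypothetical /opt/OpenMPI/bin/mpirun wrapper: forward to the install
# tree that matches the compiler module the user has loaded.
mpirun_cmd() {
    # $1 names the compiler (e.g. pgi, intel); the module would set it.
    root="/opt/OpenMPI/1.1.2-$1"
    printf '%s/bin/mpirun --prefix %s' "$root" "$root"
}

# A real wrapper would then exec the result with the user's arguments:
#   exec $(mpirun_cmd "${MPI_COMPILER:-pgi}") "$@"
mpirun_cmd "${MPI_COMPILER:-pgi}"
```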

We have used this kind of setup before to provide different versions of
Open MPI.

The upcoming Open MPI 1.2 will provide an --enable-orterun-prefix-by-default
configure flag to always have the prefix passed.
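For completeness, enabling that at build time would look roughly like
this (the install prefix here is illustrative):

```shell
# Hypothetical Open MPI 1.2 build: bake the prefix behavior in so users
# never need to pass --prefix to mpirun themselves.
./configure --prefix=/opt/OpenMPI/1.2-intel \
            --enable-orterun-prefix-by-default
make all install
```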

Hope this helps.

With best regards,
Rainer


> Is there some way to specify something like "use the same 
> path as you are in" from the root process?  There was some allusion in
> the FAQ to changing the wrappers to include directives, but the link led
> to a "no such category" page.

-- 

Dipl.-Inf. Rainer Keller        http://www.hlrs.de/people/keller
 High Performance Computing     Tel: ++49 (0)711-685 6 5858
   Center Stuttgart (HLRS)      Fax: ++49 (0)711-685 6 5832
 POSTAL: Nobelstrasse 19        email: kel...@hlrs.de
 ACTUAL: Allmandring 30, R.O.030    AIM: rusraink
 70550 Stuttgart


Re: [OMPI users] Starting on remote nodes

2006-10-25 Thread Michael Kluskens


On Oct 25, 2006, at 11:43 AM, Katherine Holcomb wrote:


...We support multiple compilers (specifically PGI and Intel) and due to
incompatibilities in different vendors' f90 .mod files, we have separate
directories for OpenMPI with each compiler.  Therefore we cannot set a
global path to the OpenMPI binaries -- it will differ depending on the
user's choice of compiler.  I have read about the --prefix flag and this
does work, but our users are mostly barely conversant with Unix and many
would have difficulty finding and specifying the appropriate path.  (We
use the modules software environment currently to set paths and the like
for them.)  Is there some way to specify something like "use the same
path as you are in" from the root process?  There was some allusion in
the FAQ to changing the wrappers to include directives, but the link led
to a "no such category" page.


I presume you left some critical piece of information out of your
message, like the name and configuration of the batch queueing system
you are using.

The answer to your question as worded may not be the best answer for
your problem.

I have dealt with two cases similar to yours:

1) Large system using Modules and the LSF batch queueing system -- this
kind of system requires the people configuring LSF to set some things
up, or the end users have to use the --prefix flag to get the OpenMPI
path, plus more to get the correct compiler (something I never figured
out how to do before the LSF admins extended their LSF installation to
cover OpenMPI).  [Exactly what needs setting up I don't know; I'm not an
LSF admin.]

2) Local system I sysadmin: learning the Modules setup was going to take
more time than I had available, so I wrote a script that sets PATH,
MANPATH, and LD_LIBRARY_PATH based on arguments similar to the real
Modules software (also G95_INCLUDE_PATH for g95).  When the user sets
the environment variables via my script and then runs OpenMPI, I see no
problems with OpenMPI on the other nodes; however, we don't have a batch
queueing system.  I don't see why using the Modules software would be
any different.  One critical piece is that my script also aliases
mpirun, for example 'alias mpirun "mpirun --prefix
/opt/g95/openmpi/1.1.1"' (which the real Modules software should also be
able to do if needed), and I have only one installation for each
compiler (g95, Intel, PGI, Absoft).
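A stripped-down sketch of that kind of setup script, with illustrative
paths (the real script's locations and argument handling are not shown
in this thread):

```shell
# Hypothetical setup function: point PATH/MANPATH/LD_LIBRARY_PATH at
# one compiler's Open MPI tree.  The argument names a compiler
# (g95, intel, pgi, absoft); paths are illustrative.
setup_ompi() {
    root="/opt/$1/openmpi/1.1.1"
    PATH="$root/bin:$PATH"
    MANPATH="$root/man:${MANPATH:-}"
    LD_LIBRARY_PATH="$root/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH MANPATH LD_LIBRARY_PATH
    # csh users would additionally get the alias Michael describes:
    #   alias mpirun "mpirun --prefix /opt/$1/openmpi/1.1.1"
}

setup_ompi g95
```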


Michael




Re: [OMPI users] Starting on remote nodes

2006-10-25 Thread Katherine Holcomb

> 
> I presume you left some critical piece of information out of your  
> message, like the name and configuration of the batch queueing system  
> you are using.

We're using PBS Pro although I don't think it's a factor in this
particular situation.  (I did find some behavior with PBS Pro that
seemed not to be as advertised, i.e. it was placing all the processes on
one node when two were requested unless the -machinefile flag was
explicitly set to $PBS_NODEFILE, but that was a different problem.)
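For reference, the workaround described above amounts to something like
this in the PBS job script (node and process counts, and the binary
name, are illustrative):

```shell
#!/bin/sh
#PBS -l nodes=2:ppn=2
# Without -machinefile, the observed behavior was that all processes
# landed on one node; passing the PBS-generated node list spreads them.
cd "$PBS_O_WORKDIR"
mpirun -np 4 -machinefile "$PBS_NODEFILE" ./a.out
```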

> 
> The answer to your question as worded may not be the best answer for  
> your problem.
> 
> I have dealt with two cases similar to yours:
> 
> 1) Large system using Modules and the LSF batch queueing system -- this
> kind of system requires the people configuring LSF to set some things
> up, or the end users have to use the --prefix flag to get the OpenMPI
> path, plus more to get the correct compiler (something I never figured
> out how to do before the LSF admins extended their LSF installation to
> cover OpenMPI).  [Exactly what needs setting up I don't know; I'm not
> an LSF admin.]

It does look like we'll have to use the --prefix flag, at least to
start.  Rainer Keller pointed out that I can set an environment variable
in the module script and that does seem to be the best option for now.
We'd rather not get into wrapping the binaries.

> 
> 2) Local system I sysadmin: learning the Modules setup was going to
> take more time than I had available, so I wrote a script that sets
> PATH, MANPATH, and LD_LIBRARY_PATH based on arguments similar to the
> real Modules software (also G95_INCLUDE_PATH for g95).  When the user
> sets the environment variables via my script and then runs OpenMPI, I
> see no problems with OpenMPI on the other nodes; however, we don't
> have a batch queueing system.  I don't see why using the Modules
> software would be any different.  One critical piece is that my script
> also aliases mpirun, for example 'alias mpirun "mpirun --prefix
> /opt/g95/openmpi/1.1.1"' (which the real Modules software should also
> be able to do if needed), and I have only one installation for each
> compiler (g95, Intel, PGI, Absoft).

Long term we are probably going to do something similar (write our own
Modules replacement).  For one thing, the Modules software doesn't seem
to have been maintained for a while, and for another, it uses Tcl, which
is not much of a mainstream language anymore.

-- 
Katherine Holcomb, Ph.D.                kholc...@virginia.edu
Research Computing Support Group - ITC  Office Phone: (434) 982-5948
148 BSEL, Clark Hall                    Center Phone: (434) 243-8799
University of Virginia 22904



[OMPI users] MPI_REDUCE vs. MPI_IN_PLACE vs. F90 Interfaces

2006-10-25 Thread Michael Kluskens
Yet another forgotten issue regarding the f90 large interfaces
(note that MPI_IN_PLACE is currently an integer; for a time it was a
double complex, but that has been fixed).

The problem I have now is that my patches which worked with 1.2 don't
work with 1.3.  I've tried various fixes for my patches and I don't
have a solution like I have for MPI_Gather.


Michael


Consider

 call MPI_REDUCE(MPI_IN_PLACE, sumpfi, sumpfmi, MPI_INTEGER, MPI_SUM, 0, allmpi, ier)

Error: Generic subroutine 'mpi_reduce' at (1) is not consistent with a
specific subroutine interface


sumpfi is an integer array, sumpfmi is an integer.

The problem is that MPI_IN_PLACE is an integer, so you can only
compile with the large interface file when the second argument of
MPI_REDUCE is an integer, not an integer array, or a character, or a
logical, ...

So this doubles the number of f90 interfaces needed for MPI_REDUCE
(and anything else that uses MPI_IN_PLACE).
-