Re: [OMPI users] Issue with PBS Pro

2009-01-30 Thread Kiril Dichev
Hi,

I did a few things wrong before:

1. The component formerly called "pls" is now named "plm".
2. Components are now specified with a ":" separator instead of a "-"
separator.

Anyway, specifying "--enable-mca-static=plm:tm" fixes the problem for me:
Open MPI is still built as shared libraries, but the Torque support is
compiled in statically.
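
In other words, the essential change relative to the configure line quoted
below is just the extra flag (ras:tm can presumably be listed the same way,
comma-separated, if that component also needs the static Torque library):

  ../configure --with-tm=/usr/pbs/ --enable-mca-static=plm:tm [other options as before]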

Cheers,
Kiril

On Thu, 2009-01-29 at 12:37 -0700, Ralph Castain wrote:
> On a Torque system, your job is typically started on a backend node.  
> Thus, you need to have the Torque libraries installed on those nodes -  
> or else build OMPI static, as you found.
> 
> I have never tried --enable-mca-static, so I have no idea if this  
> works or what it actually does. If I want static, I just build the  
> entire tree that way.
> 
> If you want to run dynamic, though, you'll have to make the Torque  
> libs available on the backend nodes.
> 
> Ralph
> 
> 
> On Jan 29, 2009, at 8:32 AM, Kiril Dichev wrote:
> 
> > Hi,
> >
> > I am trying to run with Open MPI 1.3 on a cluster using PBS Pro:
> >
> > pbs_version = PBSPro_9.2.0.81361
> >
> >
> > However, after compiling with these options:
> >
> > ../configure \
> >   --prefix=/home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-dynamic-threads \
> >   CC=/opt/intel/cce/10.1.015/bin/icc \
> >   CXX=/opt/intel/cce/10.1.015/bin/icpc \
> >   CPP="/opt/intel/cce/10.1.015/bin/icc -E" \
> >   FC=/opt/intel/fce/10.1.015/bin/ifort \
> >   F90=/opt/intel/fce/10.1.015/bin/ifort \
> >   F77=/opt/intel/fce/10.1.015/bin/ifort \
> >   --enable-mpi-f90 --with-tm=/usr/pbs/ \
> >   --enable-mpi-threads=yes --enable-contrib-no-build=vt
> >
> > I get runtime errors when running on more than one reserved node
> > even /bin/hostname:
> >
> > /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-dynamic-threads/bin/mpirun -np 5 /bin/hostname
> > /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-dynamic-threads/bin/mpirun: symbol lookup error: /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-dynamic-threads/lib/openmpi/mca_plm_tm.so: undefined symbol: tm_init
> >
> > When running on one node only, I don't get this error.
> >
> > Now, I see that I only have static PBS libraries so I tried to compile
> > this component statically. I added to the above configure:
> > "--enable-mca-static=ras-tm,pls-tm"
> >
> > However, nothing changed. The same errors occur.
> >
> >
> > But if I compile Open MPI only with static libraries ("--enable-static
> > --disable-shared"), the MPI (or non-MPI) programs run OK.
> >
> > Can you help me here?
> >
> > Thanks,
> > Kiril
> >
> >
> >
> > -- 
> > Dipl.-Inf. Kiril Dichev
> > Tel.: +49 711 685 60492
> > E-mail: dic...@hlrs.de
> > High Performance Computing Center Stuttgart (HLRS)
> > Universität Stuttgart
> > 70550 Stuttgart
> > Germany
> >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
-- 
Dipl.-Inf. Kiril Dichev
Tel.: +49 711 685 60492
E-mail: dic...@hlrs.de
High Performance Computing Center Stuttgart (HLRS)
Universität Stuttgart
70550 Stuttgart
Germany




Re: [OMPI users] openmpi over tcp

2009-01-30 Thread Peter Kjellstrom
On Thursday 29 January 2009, Ralph Castain wrote:
> It is quite likely that you have IPoIB on your system. In that case,
> the TCP BTL will pickup that interface and use it.
...

Sub 3us latency rules out IPoIB for sure. The test below ran on native IB or 
some other very low latency path.

> > # OSU MPI Latency Test v3.1.1
> > # Size          Latency (us)
> > 0                       2.41
> > 1                       2.66
> > 2                       2.85
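
A quick way to check which path a run is actually using is to force the TCP
BTL onto a specific interface and re-run the benchmark, along these lines
(the interface name eth0 and the benchmark binary name are assumptions for
the local setup):

  mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 -np 2 ./osu_latency

If the latency then jumps to tens of microseconds, the numbers above were not
coming from plain TCP over Ethernet.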

/Peter




[OMPI users] open-mpi.org website down?

2009-01-30 Thread Sean Davis
Is the open-mpi.org site down, or do I have a local problem with access?

Thanks,
Sean


Re: [OMPI users] open-mpi.org website down?

2009-01-30 Thread Ralph Castain

It is briefly down for some maintenance this morning


On Jan 30, 2009, at 7:14 AM, Sean Davis wrote:

Is the open-mpi.org site down, or do I have a local problem with  
access?


Thanks,
Sean

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] open-mpi.org website down?

2009-01-30 Thread Jeff Squyres
FWIW, I usually announce short outages like that on the devel list;  
this one was supposed to only be 30 minutes this morning but they  
apparently ran into some problems when trying to bring the machine back.


There still appears to be a residual problem: the server is up (e.g., mail is
flowing), but I cannot connect to http://www.open-mpi.org/.  I've pinged the
sysadmins to make sure they know.  Hopefully, it'll be back soon.



On Jan 30, 2009, at 10:58 AM, Ralph Castain wrote:


It is briefly down for some maintenance this morning


On Jan 30, 2009, at 7:14 AM, Sean Davis wrote:

Is the open-mpi.org site down, or do I have a local problem with  
access?


Thanks,
Sean

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems



[OMPI users] Fwd: [GE users] Open MPI job fails when run thru SGE

2009-01-30 Thread Sangamesh B
Dear Open MPI,

Do you have a solution for the following problem with Open MPI (1.3)
when run through Grid Engine?

I changed global execd params with H_MEMORYLOCKED=infinity and
restarted the sgeexecd in all nodes.
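
(For reference, a sketch of how that change is typically applied; the exact
qconf workflow may differ with your SGE version:)

  $ qconf -mconf global
  # then, in the editor, set:
  execd_params   H_MEMORYLOCKED=infinity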

But still the problem persists:

 $cat err.77.CPMD-OMPI
ssh_exchange_identification: Connection closed by remote host
--
A daemon (pid 31947) died unexpectedly with status 129 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
--
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--
ssh_exchange_identification: Connection closed by remote host
--
mpirun was unable to cleanly terminate the daemons on the nodes shown
below. Additional manual cleanup may be required - please refer to
the "orte-clean" tool for assistance.
--
   node-0-19.local - daemon did not report back when launched
   node-0-20.local - daemon did not report back when launched
   node-0-21.local - daemon did not report back when launched
   node-0-22.local - daemon did not report back when launched

The hostnames for the InfiniBand interfaces are ibc0, ibc1, ibc2 .. ibc23.
Maybe Open MPI is not able to identify the hosts because it is using the
node-0-.. names. Is this causing Open MPI to fail?

Thanks,
Sangamesh


On Mon, Jan 26, 2009 at 5:09 PM, mihlon  wrote:
> Hi,
>
>> Hello SGE users,
>>
>> The cluster is installed with Rocks-4.3, SGE 6.0 & Open MPI 1.3.
>> Open MPI is configured with "--with-sge".
>> ompi_info shows only one component:
>> # /opt/mpi/openmpi/1.3/intel/bin/ompi_info | grep gridengine
>> MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3)
>>
>> Is this acceptable?
> maybe yes
>
> see: http://www.open-mpi.org/faq/?category=building#build-rte-sge
>
> shell$ ompi_info | grep gridengine
> MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.3)
> MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.3)
>
> (Specific frameworks and version numbers may vary, depending on your
> version of Open MPI.)
>
>> The Open MPI parallel jobs run successfully through command line, but
>> fail when run thru SGE(with -pe orte ).
>>
>> The error is:
>>
>> $ cat err.26.Helloworld-PRL
>> ssh_exchange_identification: Connection closed by remote host
>> --
>> A daemon (pid 8462) died unexpectedly with status 129 while attempting
>> to launch so we are aborting.
>>
>> There may be more information reported by the environment (see above).
>>
>> This may be because the daemon was unable to find all the needed shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> --
>> --
>> mpirun noticed that the job aborted, but has no info as to the process
>> that caused that situation.
>> --
>> mpirun: clean termination accomplished
>>
>> But the same job runs well, if it runs on a single node but with an error:
>>
>> $ cat err.23.Helloworld-PRL
>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>> This will severely limit memory registrations.
>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>> This will severely limit memory registrations.
>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>> This will severely limit memory registrations.
>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>> This will severely limit memory registrations.
>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>> This will severely limit memory registrations.
>> --
>> WARNING: There was an error initializing an OpenFabrics device.
>>
>> Local host: node-0-4.local
>> Local device: mthca0
>> --
>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>> This will severely limit memory registrations.
>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>> This will severely limit memory registrations.

Re: [OMPI users] open-mpi.org website down?

2009-01-30 Thread Sean Davis
On Fri, Jan 30, 2009 at 11:13 AM, Jeff Squyres  wrote:

> FWIW, I usually announce short outages like that on the devel list; this
> one was supposed to only be 30 minutes this morning but they apparently ran
> into some problems when trying to bring the machine back.
>
> There still appears to be a residual problem that the server is up (e.g.,
> mail is flowing) but I cannot connect to http://www.open-mpi.org/.  I've
> pinged the sysadmins to make sure they know.  Hopefully, it'll be back soon.
>

Thanks, Jeff.

Sean


>
>
>
> On Jan 30, 2009, at 10:58 AM, Ralph Castain wrote:
>
>  It is briefly down for some maintenance this morning
>>
>>
>> On Jan 30, 2009, at 7:14 AM, Sean Davis wrote:
>>
>>  Is the open-mpi.org site down, or do I have a local problem with access?
>>>
>>> Thanks,
>>> Sean
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] open-mpi.org website down?

2009-01-30 Thread Jeff Squyres

Ok, it appears to be back now.  Sorry for the interruption...

On Jan 30, 2009, at 11:19 AM, Sean Davis wrote:




On Fri, Jan 30, 2009 at 11:13 AM, Jeff Squyres   
wrote:
FWIW, I usually announce short outages like that on the devel list;  
this one was supposed to only be 30 minutes this morning but they  
apparently ran into some problems when trying to bring the machine  
back.


There still appears to be a residual problem that the server is up  
(e.g., mail is flowing) but I cannot connect to http://www.open-mpi.org/ 
.  I've pinged the sysadmins to make sure they know.  Hopefully,  
it'll be back soon.


Thanks, Jeff.

Sean




On Jan 30, 2009, at 10:58 AM, Ralph Castain wrote:

It is briefly down for some maintenance this morning


On Jan 30, 2009, at 7:14 AM, Sean Davis wrote:

Is the open-mpi.org site down, or do I have a local problem with  
access?


Thanks,
Sean

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
Cisco Systems


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Fwd: [GE users] Open MPI job fails when run thru SGE

2009-01-30 Thread Reuti

On 30.01.2009, at 15:02, Sangamesh B wrote:


Dear Open MPI,

Do you have a solution for the following problem with Open MPI (1.3)
when run through Grid Engine?

I changed global execd params with H_MEMORYLOCKED=infinity and
restarted the sgeexecd in all nodes.

But still the problem persists:

 $cat err.77.CPMD-OMPI
ssh_exchange_identification: Connection closed by remote host


I think this might already be the reason why it's not working. Does a
simple MPI hello-world program run fine through SGE?
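
(For reference, a minimal submission sketch for such a test; the PE name
"orte" and the slot count are assumptions for your setup:)

  #!/bin/sh
  #$ -pe orte 8
  #$ -cwd
  mpirun -np $NSLOTS ./mpihello

If even this fails with ssh_exchange_identification errors, the problem is in
the SGE/ssh setup rather than in the application.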


-- Reuti


-- 

A daemon (pid 31947) died unexpectedly with status 129 while  
attempting

to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed  
shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to  
have the

location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
-- 

-- 


mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
-- 


ssh_exchange_identification: Connection closed by remote host
-- 


mpirun was unable to cleanly terminate the daemons on the nodes shown
below. Additional manual cleanup may be required - please refer to
the "orte-clean" tool for assistance.
-- 


   node-0-19.local - daemon did not report back when launched
   node-0-20.local - daemon did not report back when launched
   node-0-21.local - daemon did not report back when launched
   node-0-22.local - daemon did not report back when launched

The hostnames for the InfiniBand interfaces are ibc0, ibc1, ibc2 .. ibc23.
Maybe Open MPI is not able to identify the hosts because it is using the
node-0-.. names. Is this causing Open MPI to fail?

Thanks,
Sangamesh


On Mon, Jan 26, 2009 at 5:09 PM, mihlon  wrote:

Hi,


Hello SGE users,

The cluster is installed with Rocks-4.3, SGE 6.0 & Open MPI 1.3.
Open MPI is configured with "--with-sge".
ompi_info shows only one component:
# /opt/mpi/openmpi/1.3/intel/bin/ompi_info | grep gridengine
MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3)

Is this acceptable?

maybe yes

see: http://www.open-mpi.org/faq/?category=building#build-rte-sge

shell$ ompi_info | grep gridengine
MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.3)
MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.3)

(Specific frameworks and version numbers may vary, depending on your
version of Open MPI.)

The Open MPI parallel jobs run successfully through command line,  
but

fail when run thru SGE(with -pe orte ).

The error is:

$ cat err.26.Helloworld-PRL
ssh_exchange_identification: Connection closed by remote host
 
--
A daemon (pid 8462) died unexpectedly with status 129 while  
attempting

to launch so we are aborting.

There may be more information reported by the environment (see  
above).


This may be because the daemon was unable to find all the needed  
shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to  
have the

location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
 
--
 
--
mpirun noticed that the job aborted, but has no info as to the  
process

that caused that situation.
 
--

mpirun: clean termination accomplished

But the same job runs well, if it runs on a single node but with  
an error:


$ cat err.23.Helloworld-PRL
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
 
--

WARNING: There was an error initializing an OpenFabrics device.

Local host: node-0-4.local
Local device: mthca0
 
--

libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.

Re: [OMPI users] Using compiler_args in wrappers with Portland Group Compilers

2009-01-30 Thread Brian W. Barrett

On Thu, 29 Jan 2009, Wayne Gilmore wrote:


I'm trying to use the compiler_args field in the wrappers script to deal
with 32 bit compiles on our cluster.

I'm using Portland Group compilers and use the following for 32 bit
builds:  -tp p7 (I actually tried to use -tp x32 but it does not compile
correctly.  I think it has something to do with how the atomic operations are 
defined)


I've created a separate stanza in the wrapper but I am not able to use
the whole option "-tp p7" for the compiler_args.  It only works if I do
compiler_args=p7

Is there a way to provide compiler_args with arguments that contain a space?

This would eliminate cases where 'p7' would appear elsewhere in the compile 
line and be falsely recognized as a 32 bit build.


You should be able to include options just as you want them to appear on 
the command line.  Can you send me both the .txt file you edited as well 
as the output of mpicc -showme (or whichever compiler you are testing)?


Thanks,

Brian


Re: [OMPI users] Rmpi and LAM

2009-01-30 Thread Brian W. Barrett

On Thu, 29 Jan 2009, Paul Wardman wrote:


I'm using R on a Ubuntu 8.10 machine, and, in particular, quite a lot of
papply calls to analyse data. I'm currently using the LAM implementation,
as it's the only one I've got to work properly. However, while it works
fine on one PC, it fails with the error message

Error in mpi.comm.spawn(slave = system.file("Rslaves.sh", package =
"Rmpi"),  :
  MPI_Error_string: error spawning process

when I try to run it over a network on two machines. However, I've got
passwordless ssh working fine, and the lamnodes command seems to suggest
I've got all the nodes up and running fine (the other computer is also
Ubuntu 8.10) and lamhosts() from within R shows all the nodes perfectly
well. I've even got mpirun to work on both machines.

Can anyone help with (A) getting my current setup with R to work and / or
(B) suggestions for getting OpenMPI to work at all! (and preferably on
multiple machines).


For help with LAM/MPI, I'd suggest posting to the LAM mailing list, 
although I don't think we'll be able to help you much, since it looks like 
R's MPI package ends up eating the useful error message.  It might be 
useful to ask the developers of that package if they've seen such problems 
before.


For Open MPI, you're going to have to provide a bit more detail (like what 
doesn't work!).


Brian

Re: [OMPI users] Pinned memory

2009-01-30 Thread Brian W. Barrett

On Thu, 29 Jan 2009, Gabriele Fatigati wrote:


Dear OpenMPI Developer,
I have a doubt regarding the mpi_leave_pinned parameter. Suppose I have a simple loop:

for (int i = 0; i < 100; i++)
    MPI_Reduce(a, b, ...);

My question is: if I set mpi_leave_pinned=1, are the buffers pinned for the
whole lifetime of the process, or only during the for loop?

When the loop is finished, are the a and b buffers unregistered?


When mpi_leave_pinned is set to 1, any memory used in long-message 
communication stays pinned until it is released back to the OS, until the MPI 
has to unpin it in order to pin some other memory, or until MPI_Finalize, 
whichever comes first.


Note that a and b might never have been pinned in your example, if they 
are short buffers, as the copy protocol is always used for short messages.
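
A minimal sketch of the scenario (the buffer length is an arbitrary
assumption, chosen to be well above a typical eager limit so the long-message
protocol and the registration cache actually come into play):

  #include <mpi.h>
  #include <stdlib.h>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);

      /* 1 MiB of doubles: large enough that the eager/copy protocol
         is not used, so registration caching matters. */
      const int n = 1 << 17;
      double *a = malloc(n * sizeof(double));
      double *b = malloc(n * sizeof(double));
      for (int i = 0; i < n; i++) a[i] = 1.0;

      /* With mpi_leave_pinned=1 the registrations of a and b are cached
         after the first iteration and reused by the later ones. */
      for (int i = 0; i < 100; i++)
          MPI_Reduce(a, b, n, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

      /* Freeing the buffers (releasing them toward the OS) is one of the
         points at which a cached registration goes away. */
      free(a);
      free(b);

      MPI_Finalize();
      return 0;
  }

Run with, e.g., "mpirun --mca mpi_leave_pinned 1 -np 2 ./reduce_test".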



Brian


Re: [OMPI users] Open MPI 1.3 segfault on amd64 with Rmpi

2009-01-30 Thread Jeff Squyres

On Jan 26, 2009, at 3:33 PM, Dirk Eddelbuettel wrote:

Gdb doesn't want to step into the Open MPI code; I used debugging  
symbols for
both R and Open MPI that are available via -dbg packages with the  
debugging
info.  So descending one function at a time, I see the following  
calling

sequence

 MPI_Init
 ompi_mpi_init
 orte_init
 opal_init
 opal_paffinity_base_open
 mca_base_components_open
 open_components

where things end in the loop over opal_list() elements.  I still see a
fprintf() statement just before

  if (MCA_SUCCESS == component->mca_register_component_params()) {

in the middle of the open_components function in the file
mca_base_components_open.c


Do you know if component is non-NULL and has a sensible value (i.e.,  
pointing to a valid component)?


Does ompi_info work?  (ompi_info uses this exact same code to find/open
components.)  If ompi_info fails, you should be able to attach a debugger to
that, since it's a serial and [relatively] straightforward app.
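
For example, something along these lines (assuming gdb is available on the
machine):

  gdb --args ompi_info --all
  (gdb) run
  (gdb) bt        # if it segfaults, this shows where the component open failed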


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Openmpi 1.3 problems with libtool-ltdl on CentOS 4 and 5

2009-01-30 Thread Roy Dragseth
On Friday 23 January 2009 15:31:59 Jeff Squyres wrote:
> Ew.  Yes, I can see this being a problem.
>
> I'm guessing that the real issue is that OMPI embeds the libltdl from
> LT 2.2.6a inside libopen_pal (one of the internal OMPI libraries).
> Waving my hands a bit, but it's not hard to imagine some sort of clash
> is going on between the -lltdl you added to the command line and the
> libltdl that is embedded in OMPI's libraries.
>
> Can you verify that this is what is happening?

Hi, sorry for the delay.

I'm not very familiar with the workings of ltdl; I got this from one of our 
users.  Would you suggest that if one uses Open MPI 1.3 together with ltdl, 
one should not explicitly link with -lltdl?  At least that seems to work 
correctly with the example I posted: I can link the program without 
specifying -lltdl, the symbol resolves to something in the Open MPI 
libraries, and the example runs without crashing.
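
One way to confirm where the lt_dl* symbols actually come from is something
like the following (the library path is an assumption for our install):

  nm -D /opt/openmpi-1.3/lib/libopen-pal.so | grep " lt_dl"

If the embedded libltdl shows up there, that would explain the clash with an
external -lltdl.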

Regards,
r.

-- 
  The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
  phone:+47 77 64 41 07, fax:+47 77 64 41 00
 Roy Dragseth, Team Leader, High Performance Computing
 Direct call: +47 77 64 62 56. email: roy.drags...@uit.no



Re: [OMPI users] Open MPI 1.3 segfault on amd64 with Rmpi

2009-01-30 Thread Dirk Eddelbuettel

On 30 January 2009 at 16:15, Jeff Squyres wrote:
| On Jan 26, 2009, at 3:33 PM, Dirk Eddelbuettel wrote:
| 
| > Gdb doesn't want to step into the Open MPI code; I used debugging  
| > symbols for
| > both R and Open MPI that are available via -dbg packages with the  
| > debugging
| > info.  So descending one function at a time, I see the following  
| > calling
| > sequence
| >
| >  MPI_Init
| >  ompi_mpi_init
| >  orte_init
| >  opal_init
| >  opal_paffinity_base_open
| >  mca_base_components_open
| >  open_components
| >
| > where things end in the loop over opal_list() elements.  I still see a
| > fprintf() statement just before
| >
| >   if (MCA_SUCCESS == component->mca_register_component_params()) {
| >
| > in the middle of the open_components function in the file
| > mca_base_components_open.c
| 
| Do you know if component is non-NULL and has a sensible value (i.e.,  
| pointing to a valid component)?

Do not. Everything (in particular below /etc/openmpi/) is at default values
with the sole exception of 

# edd 18 Dec 2008
mca_component_show_load_errors = 0

Could that kill it?  [ Goes off and tests... ] No, still dies with segfault
in open_components.

| Does ompi_info work?  (ompi_info uses this exact same code to find/ 
| open components)  If ompi_info fails, you should be able to attach a  
| debugger to that, since it's a serial and [relatively] straightforward  
| app.

Yes, ompi_info happily runs and returns around 111 lines. It seems to loop
over around 25 mca components.

Open MPI is otherwise healthy and happy.  It's just that Rmpi does not get
along with Open MPI 1.3, but this happens to be my personal use case :-/

Dirk

-- 
Three out of two people have difficulties with fractions.


[OMPI users] Installing OpenMPI for Intel Fortran on OSX??

2009-01-30 Thread Goldstein, Bruce E
I'm on OSX 10.5 and have Intel Fortran, but I do not have Intel C, so I need
to compile OpenMPI using gcc, which is supposed to work.
When I run the configure script I get this error message:

configure: error: C and Fortran 77 compilers are not link compatible.  Can not continue.

I am guessing that the problem is that Intel Fortran compiles for a 64-bit
architecture, while gcc by default compiles for 32-bit.  I tried setting the
environment variable CFLAGS=-m64, but this did not help.  I am really a
novice at passing compiler flags to scripts, and since I don't know these
flags I may have passed the wrong value.  I would appreciate it if anyone who
knows what to do could provide a very specific recipe for exactly how to fix
this problem.
Thanks very much.  -bg


Re: [OMPI users] Installing OpenMPI for Intel Fortran on OSX??

2009-01-30 Thread Goldstein, Bruce E
Please ignore my request for help with OpenMPI and Intel Fortran from a few
minutes ago; the magic incantation was in the building section of the FAQ,
but I had been looking elsewhere in the FAQ. -bg
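
(For the archives: the usual recipe is to pass matching 64-bit flags to every
compiler at configure time; the exact flag set below is an assumption and may
need adjusting for your compiler versions:)

  ./configure CC=gcc CFLAGS=-m64 CXX=g++ CXXFLAGS=-m64 \
      F77=ifort FFLAGS=-m64 FC=ifort FCFLAGS=-m64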


Re: [OMPI users] Using compiler_args in wrappers with Portland Group Compilers

2009-01-30 Thread Wayne Gilmore

>You should be able to include options just as you want them to appear on
>the command line. Can you send me both the .txt file you edited as well
>as the output of mpicc -showme (or whichever compiler you are testing)?
>
>Thanks,
>
>Brian

For a regular 64 bit build:
(no problems here, works fine)

katana:~ % mpicc --showme
pgcc -D_REENTRANT
-I/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/include
-Wl,-rpath
-Wl,/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/lib
-L/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/lib
-lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil
-lpthread -ldl

For a 32 bit build when compiler_args is set to "-tp p7" in the wrapper:
(note that in this case it does not pick up the lib32 and include32 dirs)

katana:share/openmpi % mpicc -tp p7 --showme
pgcc -D_REENTRANT
-I/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/include
-tp p7 -Wl,-rpath
-Wl,/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/lib
-L/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/lib
-lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil
-lpthread -ldl

For a 32 bit build when compiler_args is set to "p7" in the wrapper
(note that in this case it does pick up the lib32 and include32 dirs)

katana:share/openmpi % mpicc -tp p7 --showme
pgcc -D_REENTRANT
-I/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/include32
-I/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/include32
-tp p7 -Wl,-rpath
-Wl,/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/lib32
-L/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/lib32
-lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil
-lpthread -ldl

Here's the mpicc-wrapper-data.txt file that I am using: (with
compiler_args set to "p7" only.  This works, but if I set it to "-tp p7"
it fails to pick up the info in the stanza)

compiler_args=
project=Open MPI
project_short=OMPI
version=1.3
language=C
compiler_env=CC
compiler_flags_env=CFLAGS
compiler=pgcc
extra_includes=
preprocessor_flags=-D_REENTRANT
compiler_flags=
linker_flags=-Wl,-rpath
-Wl,/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/lib
libs=-lmpi -lopen-rte -lopen-pal   -ldl   -Wl,--export-dynamic -lnsl
-lutil -lpthread -ldl
required_file=
includedir=${includedir}
libdir=${libdir}

compiler_args=p7
project=Open MPI
project_short=OMPI
version=1.3
language=C
compiler_env=CC
compiler_flags_env=CFLAGS
compiler=pgcc
extra_includes=
preprocessor_flags=-D_REENTRANT
-I/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/include32
compiler_flags=
linker_flags=-Wl,-rpath
-Wl,/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/lib32
libs=-lmpi -lopen-rte -lopen-pal   -ldl   -Wl,--export-dynamic -lnsl
-lutil -lpthread -ldl
required_file=
includedir=/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/include32
libdir=/project/scv/waygil/local/IT/ofedmpi-1.2.5.5/mpi/pgi/openmpi-1.3/lib32