Re: [OMPI users] OpenMPI Giving problems when using -mca btl mx, sm, self

2007-09-29 Thread Hammad Siddiqi

Hi Terry,

Thanks for replying. The following command is working fine:

/opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca btl tcp,sm,self  -machinefile 
machines ./hello


The contents of machines are:
indus1
indus2
indus3
indus4
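
(For reference, ./hello is just a trivial MPI hello-world. Its actual
source is not included in this thread, but a minimal sketch along these
lines would do; it can be built with the mpicc wrapper from the same
installation:)

/* hello.c: minimal MPI hello-world sketch; the real test source is not
   shown in this thread, so the details here are illustrative only */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &len);
    printf("Hello from rank %d of %d on %s\n", rank, size, name);
    MPI_Finalize();
    return 0;
}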

I have tried using np=2 over pairs of machines, but the problem is the same.
The errors that occur are given below with the command that I am trying.

**Test 1**

/opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self  -host 
"indus1,indus2" ./hello

--
Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

 PML add procs failed
 --> Returned "Unreachable" (-12) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

 PML add procs failed
 --> Returned "Unreachable" (-12) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)

**Test 2**

/opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self  -host 
"indus1,indus3" ./hello

--
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

 PML add procs failed
 --> Returned "Unreachable" (-12) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--
Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

 PML add procs failed
 --> Returned "Unreachable" (-12) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
**Test 3**

/opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self  -host 
"indus1,indus4" ./hello

--
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the

Re: [OMPI users] Open MPI on 64 bits intel Mac OS X

2007-09-29 Thread Massimo Cafaro

Brian,

thank you very much for your suggestion. I have successfully recompiled
Open MPI for 64 bits and it works like a charm. Anyway, it would be nice
to have this option available as a configure switch.


Cheers,
Massimo

On Sep 28, 2007, at 3:28 PM, Brian Barrett wrote:



On Sep 28, 2007, at 4:56 AM, Massimo Cafaro wrote:


Dear all,

when I try to compile my MPI code on 64-bit Intel Mac OS X the
build fails, since the Open MPI library has been compiled for 32
bits. Can you please provide in the next version the ability, at
configure time, to choose between 32 and 64 bits, or even better,
compile by default using both modes?

To reproduce the problem, simply compile an MPI application on 64-bit
Intel Mac OS X using mpicc -arch x86_64. The 64-bit linker
complains as follows:

ld64 warning: in /usr/local/mpi/lib/libmpi.dylib, file is not of
required architecture
ld64 warning: in /usr/local/mpi/lib/libopen-rte.dylib, file is not
of required architecture
ld64 warning: in /usr/local/mpi/lib/libopen-pal.dylib, file is not
of required architecture

and a number of undefined symbols are shown, one for each MPI
function used in the application.


This is already possible.  Simply use the configure options:

   ./configure ... CFLAGS="-arch x86_64" CXXFLAGS="-arch x86_64"
OBJCFLAGS="-arch x86_64"

Also set FFLAGS and FCFLAGS to "-m64" if you have the gfortran/g95
compiler installed. The common installs of either don't speak the
-arch option, so you have to use the more traditional -m64.
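
As a quick sanity check after rebuilding (a minimal sketch, not part of
the original report): a tiny MPI program that prints sizeof(void *)
should report 8 bytes when both the application and the library are
built 64-bit; if the old 32-bit library is still being picked up, the
link step itself will fail with the ld64 warnings shown above.

/* checkbits.c: prints the pointer size; expect 8 on a 64-bit build.
   The file name is illustrative only. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    printf("sizeof(void *) = %zu bytes\n", sizeof(void *));
    MPI_Finalize();
    return 0;
}

Compile it with something like "mpicc -arch x86_64 checkbits.c -o checkbits"
and run it under mpirun.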

Hope this helps,

Brian




--
***
Massimo Cafaro, Ph.D.
Assistant Professor
Dept. of Engineering for Innovation
University of Salento, Lecce, Italy
Via per Monteroni, 73100 Lecce, Italy

National Nanotechnology Laboratory (NNL/CNR-INFM)
Euro-Mediterranean Centre for Climate Change
SPACI Consortium

Voice  +39 0832 297371    Fax +39 0832 298173
Web    http://sara.unile.it/~cafaro
E-mail massimo.caf...@unile.it, caf...@cacr.caltech.edu
***







Re: [OMPI users] OpenMPI Giving problems when using -mca btl mx, sm, self

2007-09-29 Thread Tim Prins
I would recommend trying a few things:

1. Set some debugging flags and see if that helps. So, I would try something 
like:
/opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl 
mx,self  -host "indus1,indus2" -mca btl_base_debug 1000 ./hello

This will output information as each btl is loaded, and whether or not the 
load succeeds.

2. Try running with the mx mtl instead of the btl:
/opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca mtl mx -host "indus1,indus2" ./hello

Similarly, for debug output:
/opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca mtl mx -host "indus1,indus2" -mca 
mtl_base_debug 1000 ./hello

Let me know if any of these work.

Thanks,

Tim

On Saturday 29 September 2007 01:53:06 am Hammad Siddiqi wrote:
> Hi Terry,
>
> Thanks for replying. The following command is working fine:
>
> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca btl tcp,sm,self  -machinefile
> machines ./hello
>
> The contents of machines are:
> indus1
> indus2
> indus3
> indus4
>
> I have tried using np=2 over pairs of machines, but the problem is the same.
> The errors that occur are given below with the command that I am trying.
>
> **Test 1**
>
> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self  -host
> "indus1,indus2" ./hello
> --
> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --
> --
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
>   PML add procs failed
>   --> Returned "Unreachable" (-12) instead of "Success" (0)
> --
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
> --
> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --
> --
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
>   PML add procs failed
>   --> Returned "Unreachable" (-12) instead of "Success" (0)
> --
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
>
> **Test 2**
>
> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self  -host
> "indus1,indus3" ./hello
> --
> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --
> --
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
>   PML add procs failed
>   --> Returned "Unreachable" (-12) instead of "Success" (0)
> --
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
> --
> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --
> --
> 

Re: [OMPI users] OpenMPI Giving problems when using -mca btl mx, sm, self

2007-09-29 Thread Tim Mattox
To use Tim Prins' second suggestion, you would also need to add "-mca pml cm" to
the runs with "-mca mtl mx".

On 9/29/07, Tim Prins  wrote:
> I would recommend trying a few things:
>
> 1. Set some debugging flags and see if that helps. So, I would try something
> like:
> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl
> mx,self  -host "indus1,indus2" -mca btl_base_debug 1000 ./hello
>
> This will output information as each btl is loaded, and whether or not the
> load succeeds.
>
> 2. Try running with the mx mtl instead of the btl:
> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca mtl mx -host "indus1,indus2" ./hello
>
> Similarly, for debug output:
> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca mtl mx -host "indus1,indus2" -mca
> mtl_base_debug 1000 ./hello
>
> Let me know if any of these work.
>
> Thanks,
>
> Tim
>
> On Saturday 29 September 2007 01:53:06 am Hammad Siddiqi wrote:
> > Hi Terry,
> >
> > Thanks for replying. The following command is working fine:
> >
> > /opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca btl tcp,sm,self  -machinefile
> > machines ./hello
> >
> > The contents of machines are:
> > indus1
> > indus2
> > indus3
> > indus4
> >
> > I have tried using np=2 over pairs of machines, but the problem is the same.
> > The errors that occur are given below with the command that I am trying.
> >
> > **Test 1**
> >
> > /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self  -host
> > "indus1,indus2" ./hello
> > --
> > Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
> > If you specified the use of a BTL component, you may have
> > forgotten a component (such as "self") in the list of
> > usable components.
> > --
> > --
> > It looks like MPI_INIT failed for some reason; your parallel process is
> > likely to abort.  There are many reasons that a parallel process can
> > fail during MPI_INIT; some of which are due to configuration or environment
> > problems.  This failure appears to be an internal failure; here's some
> > additional information (which may only be relevant to an Open MPI
> > developer):
> >
> >   PML add procs failed
> >   --> Returned "Unreachable" (-12) instead of "Success" (0)
> > --
> > *** An error occurred in MPI_Init
> > *** before MPI was initialized
> > *** MPI_ERRORS_ARE_FATAL (goodbye)
> > --
> > Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
> > If you specified the use of a BTL component, you may have
> > forgotten a component (such as "self") in the list of
> > usable components.
> > --
> > --
> > It looks like MPI_INIT failed for some reason; your parallel process is
> > likely to abort.  There are many reasons that a parallel process can
> > fail during MPI_INIT; some of which are due to configuration or environment
> > problems.  This failure appears to be an internal failure; here's some
> > additional information (which may only be relevant to an Open MPI
> > developer):
> >
> >   PML add procs failed
> >   --> Returned "Unreachable" (-12) instead of "Success" (0)
> > --
> > *** An error occurred in MPI_Init
> > *** before MPI was initialized
> > *** MPI_ERRORS_ARE_FATAL (goodbye)
> >
> > **Test 2**
> >
> > /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self  -host
> > "indus1,indus3" ./hello
> > --
> > Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
> > If you specified the use of a BTL component, you may have
> > forgotten a component (such as "self") in the list of
> > usable components.
> > --
> > --
> > It looks like MPI_INIT failed for some reason; your parallel process is
> > likely to abort.  There are many reasons that a parallel process can
> > fail during MPI_INIT; some of which are due to configuration or environment
> > problems.  This failure appears to be an internal failure; here's some
> > additional information (which may only be relevant to an Open MPI
> > developer):
> >
> >   PML add procs failed
> >   --> Returned "Unreachable" (-12) instead of "Success" (0)
> > --
> > *** An error occurred in MPI_Init
> > *** before MPI was initialized
> > *** MPI_ERRORS_ARE_FATAL (goodbye)
> > --
> > Process 

[OMPI users] Make error - MacOSX, Intel v10 compilers and Xgrid MCA

2007-09-29 Thread James Conway
(I sent this already, but it didn't appear on the list. The tar-gzipped
output from configure and make was over 100 kB, so I am sending again
without that attached.)


It seems that the XGrid MCA in OpenMPI 1.2.4 does not compile on a
Mac/Intel system using the latest Intel compilers (it seems to be OK
with gcc). I downloaded the latest (Intel v10 20070809) C/C++ and
Fortran compiler demos and get the following error when building OpenMPI
(the output from configure and make is available but possibly too large
for the mailing list):


./configure CC=icc CXX=icpc F77=ifort F90=ifort
[...ok...]
make all
[...]
/bin/sh ../../../../libtool --mode=link gcc -g -O2 -module -avoid-version
-framework XGridFoundation -framework Foundation -export-dynamic
-Wl,-u,_munmap -Wl,-multiply_defined,suppress -o mca_pls_xgrid.la
-rpath /usr/local/lib/openmpi src/pls_xgrid_component.lo
src/pls_xgrid_module.lo src/pls_xgrid_client.lo
/Users/conway/programs/openMPI/openmpi-1.2.4/orte/libopen-rte.la
/Users/conway/programs/openMPI/openmpi-1.2.4/opal/libopen-pal.la

libtool: link: unable to infer tagged configuration
libtool: link: specify a tag with `--tag'
make[2]: *** [mca_pls_xgrid.la] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

What I notice here is that, despite my specification of the Intel
compilers on the configure command line (including the correct C++
compiler, icpc!), the libtool command that fails seems to be using gcc
(... --mode=link gcc ...) on the Xgrid sources. This is part of the
Modular Component Architecture (MCA) setup (configure.out) and it also
uses gcc for the compiles:


libtool: compile: gcc -DHAVE_CONFIG_H -I. -I../../../../opal/include
-I../../../../orte/include -I../../../../ompi/include
-I/Users/conway/programs/openMPI/openmpi-1.2.4/include -I../../../..
-D_REENTRANT -g -O2 -MT src/pls_xgrid_module.lo -MD -MP
-MF src/.deps/pls_xgrid_module.Tpo -c src/pls_xgrid_module.m
-fno-common -DPIC -o src/.libs/pls_xgrid_module.o


I wouldn't expect this, but I can't say if it is intended or not.  
This particular error can be avoided by excluding xgrid:

  ./configure CC=icc CXX=icpc F77=ifort F90=ifort --without-xgrid

James Conway


PS. Please note that the instructions for collecting install and make  
information are not quite right, maybe out-of-date. On this page:

  http://www.open-mpi.org/community/help/
the following instruction is given:
  shell% cp config.log share/include/ompi_config.h $HOME/ompi-output
There is no "share" directory in the openMPI area, and the file seems  
instead to be in "ompi":

  ompi/include/ompi_config.h

--
James Conway, PhD.,
Department of Structural Biology
University of Pittsburgh School of Medicine
Biomedical Science Tower 3, Room 2047
3501 5th Ave
Pittsburgh, PA 15260
U.S.A.
Phone: +1-412-383-9847
Fax:   +1-412-648-8998
Email: jxc...@pitt.edu
--





Re: [OMPI users] Make error - MacOSX, Intel v10 compilers and Xgrid MCA

2007-09-29 Thread Brian Barrett

On Sep 29, 2007, at 5:15 PM, James Conway wrote:


What I notice here is that, despite my specification of the Intel
compilers on the configure command line (including the correct C++
compiler, icpc!), the libtool command that fails seems to be using gcc
(... --mode=link gcc ...) on the Xgrid sources. This is part of the
Modular Component Architecture (MCA) setup (configure.out) and it also
uses gcc for the compiles:

libtool: compile: gcc -DHAVE_CONFIG_H -I. -I../../../../opal/include
-I../../../../orte/include -I../../../../ompi/include
-I/Users/conway/programs/openMPI/openmpi-1.2.4/include -I../../../..
-D_REENTRANT -g -O2 -MT src/pls_xgrid_module.lo -MD -MP
-MF src/.deps/pls_xgrid_module.Tpo -c src/pls_xgrid_module.m
-fno-common -DPIC -o src/.libs/pls_xgrid_module.o

I wouldn't expect this, but I can't say if it is intended or not.
This particular error can be avoided by excluding xgrid:
   ./configure CC=icc CXX=icpc F77=ifort F90=ifort --without-xgrid


The XGrid PLS component is actually written in Objective C, as it  
needs to use the XGrid Framework, which is in Objective C.  While gcc  
on OS X is both a C and Objective C compiler, icc is only a C  
compiler.  So gcc is being invoked as the Objective C compiler in  
this case.


Unfortunately, libtool doesn't properly speak Objective C, so when  
the C compiler and Objective C compiler are different, it can get  
confused.  We had a workaround for previous 1.2 releases, but with  
1.2.4, we broke our workaround.  A new, more stable workaround has  
been committed and should be part of the 1.2.5 release.


In the meantime, disabling XGrid will obviously work around the issue.

Brian