Re: [OMPI users] Can 2 IB HCAs give twice the bandwidth?

2008-10-22 Thread Mike Dubman
Using 2 HCAs on the same PCI-Exp bus (or 2 ports on the same HCA)
will not improve performance; PCI-Exp is the bottleneck.


On Mon, Oct 20, 2008 at 2:28 AM, Mostyn Lewis wrote:

> Well, here's what I see with the IMB PingPong test using two ConnectX DDR
> cards in each of 2 machines. I'm just quoting the last line at 10
> repetitions of the 4194304 bytes.
>
> Scali_MPI_Connect-5.6.4-59151: (multi rail setup in /etc/dat.conf)
>    #bytes  #repetitions   t[usec]   Mbytes/sec
>   4194304            10   2198.24      1819.63
> mvapich2-1.2rc2: (MV2_NUM_HCAS=2 MV2_NUM_PORTS=1)
>    #bytes  #repetitions   t[usec]   Mbytes/sec
>   4194304            10   2427.24      1647.96
> OpenMPI SVN 19772:
>    #bytes  #repetitions   t[usec]   Mbytes/sec
>   4194304            10   3676.35      1088.03
>
> Repeatable within bounds.
>
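As a sanity check on the figures above: the Mbytes/sec column appears to be the message size in units of 2^20 bytes divided by the time in seconds. Here is a minimal Python sketch under that assumption. The helper name `mbytes_per_sec` is made up for illustration and is not part of IMB, and treating t[usec] as the time for one message is also an assumption.

```python
# Assumption: "Mbytes/sec" means 2**20 bytes per second, and t[usec] is the
# time taken to move one message of the given size.
MSG_BYTES = 4194304


def mbytes_per_sec(nbytes, t_usec):
    """Bandwidth in units of 2**20 bytes per second."""
    return (nbytes / 2**20) / (t_usec * 1e-6)


# t[usec] values copied from the quoted results.
timings = {
    "Scali": 2198.24,
    "mvapich2": 2427.24,
    "OpenMPI": 3676.35,
}

rates = {name: mbytes_per_sec(MSG_BYTES, t) for name, t in timings.items()}
for name, rate in rates.items():
    print(f"{name:10s} {rate:9.2f} Mbytes/sec")
```

The recomputed rates agree with the quoted column to within rounding, so the three results can be compared directly on time alone.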
> This is OFED-1.3.1 and I peered at
> /sys/class/infiniband/mlx4_0/ports/1/counters/port_rcv_packets
> and
> /sys/class/infiniband/mlx4_1/ports/1/counters/port_rcv_packets
> on one of the 2 machines and looked at what happened for Scali
> and OpenMPI.
>
> Scali packets:
> HCA 0 port1 = 115116625 - 114903198 = 213427
> HCA 1 port1 =  78099566 -  77886143 = 213423
>                                 Total = 426850
> OpenMPI packets:
> HCA 0 port1 = 115233624 - 115116625 = 116999
> HCA 1 port1 =  78216425 -  78099566 = 116859
>                                 Total = 233858
>
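The counter arithmetic above can be reproduced with a few lines of Python. This is only a sketch over the numbers quoted in the message; it does not read the /sys counters itself, and the names `readings` and `packet_deltas` are made up for illustration. Note that packet counts alone say nothing about bytes moved, since packet sizes may differ between the two MPIs.

```python
# Counter readings (before, after) copied from the message above.
readings = {
    "Scali": {
        "HCA 0 port1": (114903198, 115116625),
        "HCA 1 port1": (77886143, 78099566),
    },
    "OpenMPI": {
        "HCA 0 port1": (115116625, 115233624),
        "HCA 1 port1": (78099566, 78216425),
    },
}


def packet_deltas(ports):
    """Packets received on each port between the two readings."""
    return {port: after - before for port, (before, after) in ports.items()}


totals = {}
for mpi, ports in readings.items():
    deltas = packet_deltas(ports)
    totals[mpi] = sum(deltas.values())
    for port, delta in deltas.items():
        # Fraction of this run's packets that arrived on this port.
        print(f"{mpi:8s} {port}: {delta:7d} ({delta / totals[mpi]:.1%})")
    print(f"{mpi:8s} total      : {totals[mpi]:7d}")
```

Both runs split their packets almost evenly across the two HCAs, which supports the observation that OpenMPI is using both ports.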
> Scali is set up so that data larger than 8192 bytes is striped
> across the 2 HCAs using 8192 bytes per HCA in a round robin fashion.
>
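To make the round-robin striping described above concrete, here is a small Python sketch. It models only the bookkeeping (which HCA gets how many bytes), not Scali's actual implementation; the function name `stripe` and its parameters are assumptions for illustration.

```python
STRIPE_BYTES = 8192  # per-HCA chunk size quoted above
NUM_HCAS = 2


def stripe(nbytes, stripe_bytes=STRIPE_BYTES, num_hcas=NUM_HCAS):
    """Return the bytes assigned to each HCA when chunks are dealt round robin."""
    per_hca = [0] * num_hcas
    offset = 0
    chunk_index = 0
    while offset < nbytes:
        # The last chunk may be shorter than a full stripe.
        chunk = min(stripe_bytes, nbytes - offset)
        per_hca[chunk_index % num_hcas] += chunk
        offset += chunk
        chunk_index += 1
    return per_hca


print(stripe(4194304))  # the PingPong message size used in this thread
print(stripe(8192))     # a message that fits in a single chunk
```

For the 4194304-byte message this gives 512 chunks, 256 per HCA, so each HCA carries exactly half of the payload.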
> So, it seems that OpenMPI is using both ports but strangely ends
> up with a Mbytes/sec rate which is worse than a single HCA only.
> If I use --mca btl_openib_if_exclude mlx4_1:1, we get
>    #bytes  #repetitions   t[usec]   Mbytes/sec
>   4194304            10   3080.59      1298.45
>
> So, what's taking so long? Is this a threading question?
>
> DM
>
>
> On Sun, 19 Oct 2008, Jeff Squyres wrote:
>
>> On Oct 18, 2008, at 9:19 PM, Mostyn Lewis wrote:
>>
>>> Can OpenMPI do like Scali and MVAPICH2 and utilize 2 IB HCAs per machine
>>> to approach double the bandwidth on simple tests such as IMB PingPong?
>>>
>>
>> Yes.  OMPI will automatically (and aggressively) use as many active ports
>> as you have.  So you shouldn't need to list devices+ports -- OMPI will
>> simply use all ports that it finds in the active state.  If your ports are
>> on physically separate IB networks, then each IB network will require a
>> different subnet ID so that OMPI can compute reachability properly.
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>


Re: [OMPI users] OpenMPI runtime-specific environment variable?

2008-10-22 Thread Jed Brown
On Wed 2008-10-22 00:40, Reuti wrote:
>
> Okay, now I see. Why not just call MPI_Comm_size(MPI_COMM_WORLD,
> &nprocs)? When nprocs is 1, it's a serial run. It can also be executed
> when not running within mpirun AFAICS.

This is absolutely NOT okay.  You cannot call any MPI functions before
MPI_Init (and at least OMPI 1.2+ and MPICH2 1.1a will throw an error if
you try).

I'm slightly confused about the original problem.  Is the program linked
against an MPI when running in serial?  You have to recompile anyway if
you change MPI implementation, so if it's not linked against a real MPI
then you know at compile time.  But what is the problem with calling
MPI_Init for a serial job?  All implementations I've used allow you to
call MPI_Init when the program is run as ./foo (no mpirun).

Jed



Re: [OMPI users] OpenMPI runtime-specific environment variable?

2008-10-22 Thread Reuti

On 22.10.2008 at 10:30, Jed Brown wrote:


> On Wed 2008-10-22 00:40, Reuti wrote:
>
>> Okay, now I see. Why not just call MPI_Comm_size(MPI_COMM_WORLD,
>> &nprocs) When nprocs is 1, it's a serial run. It can also be executed
>> when not running within mpirun AFAICS.
>
> This is absolutely NOT okay.  You cannot call any MPI functions before
> MPI_Init (and at least OMPI 1.2+ and MPICH2 1.1a will throw an error if
> you try).


Sorry for not being precise. Of course you have to call MPI_Init
(which will also return successfully when the program is not
started with mpirun), then check the number of processes you got with
the mentioned call, and if it's just one you can call MPI_Finalize
immediately and continue with a serial run.


-- Reuti


> I'm slightly confused about the original problem.  Is the program linked
> against an MPI when running in serial?  You have to recompile anyway if
> you change MPI implementation, so if it's not linked against a real MPI
> then you know at compile time.  But what is the problem with calling
> MPI_Init for a serial job?  All implementations I've used allow you to
> call MPI_Init when the program is run as ./foo (no mpirun).
>
> Jed




Re: [OMPI users] OpenMPI runtime-specific environment variable?

2008-10-22 Thread Jeff Squyres
I wonder if it would be useful to have an OMPI-specific extension for  
this kind of functionality, perhaps OMPI_Was_launched_by_mpirun() (or  
something with a better name, etc.)...?


This would be a pretty easy function for us to provide (right  
Ralph?).  My question is -- would this (and perhaps other similar  
extensions) be useful to the community at large?




On Oct 21, 2008, at 5:46 PM, Adams, Brian M wrote:


>> I'm not sure I understand the problem. The ale3d program from
>> LLNL operates exactly as you describe and it can be built
>> with mpich, lam, or openmpi.
>
> Hi Doug,
>
> I'm not sure what reply would be most helpful, so here's an attempt.
>
> It sounds like we're on the same page with regard to the desired
> behavior.  Historically, we've been able to detect serial vs.
> parallel launch of the binary foo, with a whole host of
> implementations, including those you mention, as well as some
> vendor-specific implementations (possibly including DEC/OSF, SGI,
> Sun, and AIX/poe, though I don't know all the details).  We typically
> distinguish serial from parallel executions on the basis of
> environment variables set only in the MPI runtime environment.  I
> was just trying to ascertain what variable would be best to test for
> in an OpenMPI environment, and I think Ralph helped with that.
>
> If the ale3d code takes a different approach, I'd love to hear about
> it, off-list if necessary.
>
> Brian





--
Jeff Squyres
Cisco Systems



[OMPI users] Fwd: Problems installing in Cygwin

2008-10-22 Thread Gustavo Seabra
Hi All,

(Sorry if you already got this message before, but since I didn't get
any answer, I'm assuming it didn't get through to the list.)

I am trying to install OpenMPI in Cygwin. From a Cygwin bash shell, I
configured OpenMPI with the command below:

$ echo $MPI_HOME
/home/seabra/local/openmpi-1.2.7
$ ./configure --prefix=$MPI_HOME \
   --with-mpi-param_check=always \
   --with-threads=posix \
   --enable-mpi-threads \
   --disable-io-romio \
   FC="g95" FFLAGS="-O0  -fno-second-underscore" \
   CXX="g++"

The configuration *seems* to be OK (it finishes with: "configure: exit
0"). However, when I try to install it, the installation finishes with
the error below. I wonder if anyone here could help me figure out what
is going wrong.

Thanks a lot!
Gustavo.

==
$ make clean
[...]
$ make install
[...]
Making install in mca/timer/windows
make[2]: Entering directory
`/home/seabra/local/openmpi-1.2.7/opal/mca/timer/windows'
depbase=`echo timer_windows_component.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`;\
   /bin/sh ../../../../libtool --tag=CC   --mode=compile gcc
-DHAVE_CONFIG_H -I. -I../../../../opal/include
-I../../../../orte/include -I../../../../ompi/include   -I../../../..
-D_REENTRANT  -O3 -DNDEBUG -finline-functions -fno-strict-aliasing
-MT timer_windows_component.lo -MD -MP -MF $depbase.Tpo -c -o
timer_windows_component.lo timer_windows_component.c &&\
   mv -f $depbase.Tpo $depbase.Plo
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I../../../../opal/include
-I../../../../orte/include -I../../../../ompi/include -I../../../..
-D_REENTRANT -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -MT
timer_windows_component.lo -MD -MP -MF
.deps/timer_windows_component.Tpo -c timer_windows_component.c
-DDLL_EXPORT -DPIC -o .libs/timer_windows_component.o
timer_windows_component.c:22:60:
opal/mca/timer/windows/timer_windows_component.h: No such file or
directory
timer_windows_component.c:25: error: parse error before
"opal_timer_windows_freq"
timer_windows_component.c:25: warning: data definition has no type or
storage class
timer_windows_component.c:26: error: parse error before
"opal_timer_windows_start"
timer_windows_component.c:26: warning: data definition has no type or
storage class
timer_windows_component.c: In function `opal_timer_windows_open':
timer_windows_component.c:60: error: `LARGE_INTEGER' undeclared (first
use in this function)
timer_windows_component.c:60: error: (Each undeclared identifier is
reported only once
timer_windows_component.c:60: error: for each function it appears in.)
timer_windows_component.c:60: error: parse error before "now"
timer_windows_component.c:62: error: `now' undeclared (first use in
this function)
make[2]: *** [timer_windows_component.lo] Error 1
make[2]: Leaving directory
`/home/seabra/local/openmpi-1.2.7/opal/mca/timer/windows'
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory `/home/seabra/local/openmpi-1.2.7/opal'
make: *** [install-recursive] Error 1


Re: [OMPI users] OpenMPI runtime-specific environment variable?

2008-10-22 Thread Ralph Castain

We could - though it isn't clear that it really accomplishes anything.

I believe some of the suggestions on this thread have forgotten about  
singletons. If the code calls MPI_Init, we treat that as a singleton  
and immediately set all the MPI environmental params - which means the  
proc's environment now looks exactly as if it had been launched by  
mpirun. That is by design for proper singleton operation. So doing  
anything that starts with MPI_Init isn't going to work.


What I think Brian is trying to do is detect that his code was not  
launched by mpirun -prior- to calling MPI_Init so he can decide if he  
wants to do that at all. Checking for the enviro params I suggested is  
a good way to do it - I'm not sure that adding another one really  
helps. The key issue is having something he can rely on, and I think  
the ones I suggested are probably his best bet for OMPI.
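
A rough sketch of the "check the environment before MPI_Init" idea, written in Python for brevity. The variable names are assumptions: OMPI_COMM_WORLD_SIZE is exported by mpirun in Open MPI 1.3 and later, but the specific variables suggested earlier in this thread are not quoted here and may differ, particularly for the 1.2 series. The function name `launched_by_mpirun` is likewise made up for illustration.

```python
import os

# Assumption: illustrative marker list only. OMPI_COMM_WORLD_SIZE is set by
# mpirun in Open MPI 1.3 and later; other implementations and older Open MPI
# releases use different names, so extend this tuple as needed.
ASSUMED_MARKERS = ("OMPI_COMM_WORLD_SIZE",)


def launched_by_mpirun(environ=None, markers=ASSUMED_MARKERS):
    """Return True if any marker variable is present in the environment.

    This has to run before MPI_Init: as noted above, a singleton MPI_Init
    sets the MPI environment itself, after which the check says nothing.
    """
    env = os.environ if environ is None else environ
    return any(name in env for name in markers)


if __name__ == "__main__":
    mode = "parallel" if launched_by_mpirun() else "serial"
    print(f"running in {mode} mode")
```

Keeping the marker names in one tuple means a code that supports a dozen MPI implementations only has to grow that list, not the logic.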


What would be nice is if you MPI Forum types could agree on an
MPI-standard way of doing this so Brian wouldn't have to check a dozen
different variations... :-)


Ralph

On Oct 22, 2008, at 7:21 AM, Jeff Squyres wrote:

> I wonder if it would be useful to have an OMPI-specific extension
> for this kind of functionality, perhaps
> OMPI_Was_launched_by_mpirun() (or something with a better name,
> etc.)...?
>
> This would be a pretty easy function for us to provide (right
> Ralph?).  My question is -- would this (and perhaps other similar
> extensions) be useful to the community at large?







Re: [OMPI users] Can OpenMPI support multiple compilers?

2008-10-22 Thread Jeff Squyres
Also check the FAQ on how to use the wrapper compilers -- there are  
ways to override at compile time, but be warned that it's not always  
what you want.  As Terry indicates, you probably want to have multiple  
OMPI installations -- one for each compiler.


In particular, there are problems with mixing multiple Fortran  
compilers noted in OMPI's README file:


- Open MPI will build bindings suitable for all common forms of
  Fortran 77 compiler symbol mangling on platforms that support it
  (e.g., Linux).  On platforms that do not support weak symbols (e.g.,
  OS X), Open MPI will build Fortran 77 bindings just for the compiler
  that Open MPI was configured with.

  Hence, on platforms that support it, if you configure Open MPI with
  a Fortran 77 compiler that uses one symbol mangling scheme, you can
  successfully compile and link MPI Fortran 77 applications with a
  Fortran 77 compiler that uses a different symbol mangling scheme.

  NOTE: For platforms that support the multi-Fortran-compiler bindings
  (i.e., weak symbols are supported), due to limitations in the MPI
  standard and in Fortran compilers, it is not possible to hide these
  differences in all cases.  Specifically, the following two cases may
  not be portable between different Fortran compilers:

  1. The C constants MPI_F_STATUS_IGNORE and MPI_F_STATUSES_IGNORE
     will only compare properly to Fortran applications that were
     created with Fortran compilers that use the same
     name-mangling scheme as the Fortran compiler that Open MPI was
     configured with.

  2. Fortran compilers may have different values for the logical
     .TRUE. constant.  As such, any MPI function that uses the Fortran
     LOGICAL type may only get .TRUE. values back that correspond to
     the .TRUE. value of the Fortran compiler that Open MPI was
     configured with.  Note that some Fortran compilers allow forcing
     .TRUE. to be 1 and .FALSE. to be 0.  For example, the Portland
     Group compilers provide the "-Munixlogical" option, and Intel
     compilers (version >= 8.) provide the "-fpscomp logicals" option.

  You can use the ompi_info command to see the Fortran compiler that
  Open MPI was configured with.


On Oct 19, 2008, at 8:34 PM, Terry Frankcombe wrote:


It happily supports multiple compilers on the same system, but not in
the way you mean.  You need another installation of OMPI (in,
say, /usr/lib64/mpi/intel) for icc/ifort.

Select by path manipulation.
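
One way to picture "select by path manipulation" is a small helper that puts the chosen installation first on the search paths. This is only a sketch in Python: the function name `select_ompi` is made up, /usr/lib64/mpi/intel is the example prefix suggested above rather than an existing install, and the `bin` and `lib64` subdirectory names are assumptions about the layout.

```python
import os


def select_ompi(prefix, environ, bindir="bin", libdir="lib64"):
    """Return a copy of environ with the chosen Open MPI install listed first."""
    env = dict(environ)
    for var, sub in (("PATH", bindir), ("LD_LIBRARY_PATH", libdir)):
        new = os.path.join(prefix, sub)
        old = env.get(var, "")
        # Prepend so this install's mpicc/mpif77/mpirun win the lookup.
        env[var] = new + (os.pathsep + old if old else "")
    return env


# Hypothetical second installation built with icc/ifort.
intel_env = select_ompi("/usr/lib64/mpi/intel", {"PATH": "/usr/bin"})
print(intel_env["PATH"])
print(intel_env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` so that one build script can drive either installation without touching the login environment.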

On Mon, 2008-10-20 at 08:19 +0800, Wen Hao Wang wrote:

Hi all:

I have openmpi 1.2.5 installed on SLES10 SP2. These packages should be
compiled with gcc compilers. Now I have installed Intel C++ and
Fortran compilers on my cluster. Can openmpi use Intel compilers
without recompiling?

I tried to use an environment variable to indicate the Intel compiler, but
it seems the mpi commands still wanted to use the gcc ones.
LS21-08:/opt/intel/fce/10.1.018/bin # mpif77 --showme
gfortran -I/usr/lib64/mpi/gcc/openmpi/include -pthread -L/usr/lib64/mpi/gcc/openmpi/lib64 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
LS21-08:/opt/intel/fce/10.1.018/bin # export F77=/opt/intel/fce/10.1.018/bin/ifort
LS21-08:/opt/intel/fce/10.1.018/bin # rpm -e gcc-fortran-4.1.2_20070115-0.21
LS21-08:/opt/intel/fce/10.1.018/bin # mpif77 /LTC/matmul-for-intel.f
--
The Open MPI wrapper compiler was unable to find the specified compiler
gfortran in your PATH.

Note that this compiler was either specified at configure time or in
one of several possible environment variables.
--

Is it possible to change openmpi's underlying compiler? Then I could use
multiple compilers on one machine.

Thanks in advance!

Steven Wang
Email: wangw...@cn.ibm.com




--
Jeff Squyres
Cisco Systems



Re: [OMPI users] ga-4.1 on mx segmentation violation

2008-10-22 Thread Jeff Squyres

On Oct 21, 2008, at 9:14 AM, SLIM H.A. wrote:


> I have built the release candidate for ga-4.1 with OpenMPI 1.2.3 and
> portland compilers 7.0.2 for Myrinet mx.
>
> Running the test.x for 3 Myrinet nodes each with 4 cores I get the
> following error messages:
>
> warning:regcache incompatible with malloc
> libibverbs: Fatal: couldn't read uverbs ABI version.
> --
> [0,1,3]: OpenIB on host node057 was unable to find any HCAs.
> Another transport will be used instead, although this may result in
> lower performance.
> ---


FWIW, this specific warning is fixed in the upcoming v1.3 series (I  
assume you built on a machine with libibverbs installed, but no  
OpenFabrics-capable devices).


IIRC, you can manually disable this warning by telling Open MPI to  
avoid the openib BTL (I can't test the v1.2 series on a linux machine  
ATM to verify this):


  mpirun --mca btl ^openib ...
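
If the job is launched from a script, the same option can be passed as an argument list, which sidesteps any shell that treats `^` specially. This is a minimal Python sketch; the helper name `mpirun_without_openib` is made up, and the process count and `./test.x` path in the example are assumptions based on the 3 nodes with 4 cores mentioned in this thread.

```python
def mpirun_without_openib(program_args, mpirun="mpirun"):
    """Build an mpirun argument list that excludes the openib BTL."""
    return [mpirun, "--mca", "btl", "^openib", *program_args]


# Example: 12 processes running the GA test program (path assumed).
argv = mpirun_without_openib(["-np", "12", "./test.x"])
print(" ".join(argv))
```

The resulting list can be handed straight to `subprocess.run(argv)`; because no shell is involved, the `^openib` token reaches mpirun unchanged.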


> ARMCI configured for 3 cluster nodes. Network protocol is 'MPI-SPAWN'.
> 0:Segmentation Violation error, status=: 11
> 0:ARMCI DASSERT fail. signaltrap.c:SigSegvHandler():299 cond:0
> 4:Segmentation Violation error, status=: 11
> 4:ARMCI DASSERT fail. signaltrap.c:SigSegvHandler():299 cond:0
> 6:Segmentation Violation error, status=: 11
> 6:ARMCI DASSERT fail. signaltrap.c:SigSegvHandler():299 cond:0


It looks like ARMCI is seg faulting...?  Beyond that, Bad Things will  
happen at the MPI layer before it aborts.


I'm unfamiliar with "ga" or ARMCI, so I don't know exactly what's  
happening here...


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] test OpenMPI without Internet access

2008-10-22 Thread Jeff Squyres

On Oct 19, 2008, at 7:05 PM, Wen Hao Wang wrote:

> I have one cluster without an Internet connection. I want to test
> OpenMPI functions on it. It seems MTT cannot be used. Do I have any
> other choice for the testing?




You can always run tests manually.  MTT is simply our harness for  
automated testing, which *usually* (but not always) involves  
downloading the latest nightly snapshot from the IU web site.


You can certainly configure MTT to use a local copy of Open MPI and  
not use the IU web nightly snapshot.


> I have tried lamtest. "make -k check" gave a lot of IB-related
> warnings, indicating that my dat.conf file contained an invalid entry.
> Each machine of my cluster has one IB ConnectX adapter installed,
> but I do not know why lamtest detected that.





You must have built with udapl support.  Open MPI will use as many
interfaces as you have built support for.  I do not believe we build
udapl by default on Linux; you have to ask for it specifically because
(among other reasons) OMPI would prefer to use verbs, not udapl.  If
udapl support was built, we initialize udapl at run time, and if you
have an illegal dat.conf file, I'm guessing udapl is complaining about
it.  I don't know very much about udapl, so I can't give good guidance
here, other than suggesting that you not build udapl support and test
just the native verbs support, which is where we put all of our effort
for Linux/OpenFabrics/Open MPI support.


OMPI's udapl support is mainly for Sun, because udapl *is* [currently]  
their high-performance network stack.


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] ga-4.1 on mx segmentation violation

2008-10-22 Thread Patrick Geoffray

SLIM H.A. wrote:

> I have built the release candidate for ga-4.1 with OpenMPI 1.2.3 and
> portland compilers 7.0.2 for Myrinet mx.

Which versions of ARMCI and MX?

> ARMCI configured for 3 cluster nodes. Network protocol is 'MPI-SPAWN'.
> 0:Segmentation Violation error, status=: 11
> 0:ARMCI DASSERT fail. signaltrap.c:SigSegvHandler():299 cond:0


Segfault caught by ARMCI signal handler. You may want to run it under 
gdb, to see where the segfault really comes from.


Patrick


[OMPI users] ADIOI_GEN_DELETE

2008-10-22 Thread Davi Vercillo C. Garcia (ダヴィ)
Hi,

I'm trying to run a code using OpenMPI and I'm getting this error:

ADIOI_GEN_DELETE (line 22): **io No such file or directory

I don't know why this occurs; I only know that it happens when I use more
than one process.

The code can be found at: http://pastebin.com/m149a1302

-- 
Davi Vercillo Carneiro Garcia
http://davivercillo.blogspot.com/

Universidade Federal do Rio de Janeiro
Departamento de Ciência da Computação
DCC-IM/UFRJ - http://www.dcc.ufrj.br

Grupo de Usuários GNU/Linux da UFRJ (GUL-UFRJ)
http://www.dcc.ufrj.br/~gul

Linux User: #388711
http://counter.li.org/

"Theory is when you know something, but it doesn't work. Practice is
when something works, but you don't know why.
Programmers combine theory and practice: Nothing works and they don't
know why." - Anon