Re: [OMPI users] infiniband

2008-05-01 Thread Pavel Shamis (Pasha)

A couple of other nice tools for IB monitoring:

1. perfquery (part of OFED); example report:

Port counters: Lid 12 port 1
PortSelect:..1
CounterSelect:...0x
SymbolErrors:7836
LinkRecovers:255
LinkDowned:..0
RcvErrors:...24058
RcvRemotePhysErrors:.6159
RcvSwRelayErrors:0
XmtDiscards:.3176
XmtConstraintErrors:.0
RcvConstraintErrors:.0
LinkIntegrityErrors:.0
ExcBufOverrunErrors:.0
VL15Dropped:.0
XmtData:.1930
RcvData:.1708
XmtPkts:.114
RcvPkts:.114

2. collectl (http://collectl.sourceforge.net/); example report:

#<--------CPU--------><----------Memory----------><-------InfiniBand------->
#cpu sys inter  ctxsw  free buff cach inac slab  map   KBin  pktIn  KBOut pktOut   Errs
   1   0   847   1273    1G 264M   3G 594M   1G 234M      2     29      2     29 123242
   2   1   851   2578    1G 264M   3G 594M   1G 234M      1      5      1      5 123391
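
A hedged sketch of how these tools might be invoked (the LID/port values match the perfquery report above; the flag letters should be checked against the versions you have installed):

perfquery 12 1      # dump the port counters of LID 12, port 1
collectl -scmx      # sample CPU (c), memory (m) and interconnect (x) statistics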




Pavel Shamis (Pasha) wrote:

SLIM H.A. wrote:
  

Is it possible to get information about the usage of hca ports similar
to the result of the mx_endpoint_info command for Myrinet boards?

The ibstat command gives information like this:

Port 1:
State: Active
Physical state: LinkUp

but does not say whether a job is actually using an InfiniBand port or
communicates through plain Ethernet.


I would be grateful for any advice.

You have access to some counters in
/sys/class/infiniband/mlx4_0/ports/1/counters/ (these are the counters for
HCA mlx4_0, port 1).
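
A minimal sketch of using those counters to confirm that a job really goes over IB (the counter file names are the usual ones in that sysfs directory; the mpirun line is only an illustration, not a command from this thread):

cd /sys/class/infiniband/mlx4_0/ports/1/counters
cat port_xmit_data port_rcv_data                   # note the values
mpirun -np 2 --mca btl sm,self,openib ./your_app   # example run
cat port_xmit_data port_rcv_data                   # these grow only if IB traffic flowed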


  



--
Pavel Shamis (Pasha)
Mellanox Technologies



[OMPI users] Enabling progress thread

2008-05-01 Thread Alberto Giannetti
In message http://www.open-mpi.org/community/lists/users/2007/03/2889.php I
found this comment:


"The only way to get any
benefit from the MPI_Bsend is to have a progress thread which take
care of the pending communications in the background. Such thread is
not enabled by default in Open MPI."

I understand this won't be portable, but how do you enable a sender  
progress thread in Open MPI?


Re: [OMPI users] Enabling progress thread

2008-05-01 Thread Aurélien Bouteiller
You can add --enable-progress-threads to the configure command. However,
please consider this a beta feature; we know for sure there are some bugs in
the current thread-safety support.
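
For reference, a minimal sketch of such a build (the prefix is only an example):

./configure --prefix=$HOME/openmpi-progress --enable-progress-threads
make all install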


Aurelien






[OMPI users] Install BLACS and ScaLAPACK on Leopard

2008-05-01 Thread Linwei Wang

Dear all,

   I'm new to Open MPI. I'm now trying to use BLACS and ScaLAPACK on
Leopard. Since Leopard has a built-in Open MPI, I didn't install any other
version. I followed the BLACS install guidance in the FAQ section, and the
build generated this error:


   "No rule to make target `/usr/include/mpif.h', needed by  
`mpif.h'.  Stop."


   The problem is that I could not find "mpif.h" on my computer. Does this
mean I should install another Open MPI version rather than using Leopard's
built-in version?

  Thanks for the help!

  Best,
 Linwei



[OMPI users] Running Heterogeneous MPI Application Over InfiniBand

2008-05-01 Thread Ryan Buckley ; 21426
Hello, 

I am trying to run a simple Hello World MPI application in a
heterogeneous environment.  The machines include one x86 machine with a
standard 1Gb Ethernet connection and two ppc machines with standard 1Gb
Ethernet, as well as a 10Gb InfiniBand switch between the two.
The Hello World program is the same hello_c.c that is included in the
examples directory of the Open MPI installation.

The goal is that I would like to run heterogeneous applications between
the three aforementioned machines in the following manner:

The x86 machine will use tcp to communicate with the two ppc machines,
while the ppc machines will communicate with one another over the 10Gb
InfiniBand link.

x86 <--tcp--> ppc_1
x86 <--tcp--> ppc_2
ppc1 <--openib--> ppc_2

I am currently using a machfile set up as follows,

# cat machfile




In addition I am using an appfile set up as follows, 

# cat appfile
-np 1 --hostfile machfile --host  --mca btl
sm,self,tcp,openib /path/to/ppc/openmpi-1.2.5/examples/hello
-np 1 --hostfile machfile --host  --mca btl
sm,self,tcp,openib /path/to/ppc/openmpi-1.2.5/examples/hello
-np 1 --hostfile machfile --host  --mca btl
sm,self,tcp /path/to/x86/openmpi-1.2.5/examples/hello

I am running on the command line via

# mpirun --app appfile

I've also attached the output from 'ompi_info --all' from all machines.

Any suggestions would be much appreciated.

Thanks, 

Ryan

Open MPI: 1.2.5
   Open MPI SVN revision: r16989
Open RTE: 1.2.5
   Open RTE SVN revision: r16989
OPAL: 1.2.5
   OPAL SVN revision: r16989
   MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2.5)
  MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.5)
   MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.5)
   MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.5)
   MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.5)
 MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.5)
 MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.5)
   MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
   MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: self (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.5)
  MCA io: romio (MCA v1.0, API v1.0, Component v1.2.5)
   MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.5)
   MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.5)
 MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.5)
 MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.5)
 MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.5)
  MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.5)
 MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.5)
 MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.5)
 MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.5)
 MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.5)
  MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.5)
  MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.5)
  MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.5)
 MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.5)
 MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.5)
 MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.5)
 MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.5)
 MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.5)
  MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.5)
  MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.5)
 MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
 MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.5)
 MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.5)
 MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.5)
 MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2.5)
 MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.5)
 MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.5)
 MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.5)
   MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.5)
MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.5)
MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.5)
 MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.5)
 MCA pls:

Re: [OMPI users] Install BLACS and ScaLAPACK on Leopard

2008-05-01 Thread Doug Reeder

Linwei,

mpif.h is the include file that Fortran programs use to access Open MPI. The
Apple version does not support Fortran. If you want to use Open MPI from
Fortran, you will need to install a version of Open MPI that supports
Fortran; this will install mpif.h. I suggest you install the new version in
a different directory than the Apple version (use --prefix in the Open MPI
configure command). You will also need to remove the Apple version, or
rename its include and library files, so that the linker finds your new,
Fortran-supporting version.
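
A minimal sketch of such a build (the prefix and compiler names are examples only, not taken from this thread):

./configure --prefix=/opt/openmpi-fortran F77=gfortran FC=gfortran
make all install
export PATH=/opt/openmpi-fortran/bin:$PATH   # so this build's mpif77 and mpif.h are found first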


Doug Reeder




Re: [OMPI users] users Digest, Vol 885, Issue 2

2008-05-01 Thread Ryan Buckley ; 21426
The problem is that when running over InfiniBand the application hangs
on the call to MPI_Init.

Thanks, 

Ryan



Re: [OMPI users] Install BLACS and ScaLAPACK on Leopard

2008-05-01 Thread Linwei Wang

Dear Doug,

   Thanks very much. I installed the latest Open MPI and BLACS. For
ScaLAPACK, I have a problem related to the BLAS library: since on the Mac it
is provided by vecLib, I have no idea how to set BLASLIB in the SLmake.inc
file for ScaLAPACK.


Also, though compiling BLACS succeeded, I'm not able to build the testers;
the compile generated a large amount of output like the following. Do you
have any idea what the problem is?

/usr/local/openmpi-1.2.6/bin/mpif77  -c blacstest.f
blacstest.f: In subroutine `runtests':
blacstest.f:150: warning:
CALL RUNTESTS( MEM, MEMLEN, CMEM, CMEMSIZ, PREC, NPREC,  
OUTNUM,

 1
blacstest.f:178: (continued):
 SUBROUTINE RUNTESTS( MEM, MEMLEN, CMEM, CMEMLEN, PREC, NPREC,
2
Argument #1 (named `mem') of `runtests' is one type at (2) but is some  
other type at (1) [info -f g77 M GLOBALS]

blacstest.f: In subroutine `ssdrvtest':
blacstest.f:299: warning:
  CALL SSDRVTEST(OUTNUM, VERB, NSHAPE, CMEM(UPLOPTR),
   1
blacstest.f:2545: (continued):
 SUBROUTINE SSDRVTEST( OUTNUM, VERB, NSHAPE, UPLO0, DIAG0,
2
Argument #21 (named `mem') of `ssdrvtest' is one type at (2) but is  
some other type at (1) [info -f g77 M GLOBALS]

blacstest.f: In subroutine `dsdrvtest':
blacstest.f:311: warning:
  CALL DSDRVTEST(OUTNUM, VERB, NSHAPE, CMEM(UPLOPTR),
   1
blacstest.f:2889: (continued):
 SUBROUTINE DSDRVTEST( OUTNUM, VERB, NSHAPE, UPLO0, DIAG0,
2
Argument #21 (named `mem') of `dsdrvtest' is one type at (2) but is  
some other type at (1) [info -f g77 M GLOBALS]

blacstest.f: In subroutine `csdrvtest':
blacstest.f:323: warning:
  CALL CSDRVTEST(OUTNUM, VERB, NSHAPE, CMEM(UPLOPTR),
   1
blacstest.f:3233: (continued):
 SUBROUTINE CSDRVTEST( OUTNUM, VERB, NSHAPE, UPLO0, DIAG0,
2
Argument #21 (named `mem') of `csdrvtest' is one type at (2) but is  
some other type at (1) [info -f g77 M GLOBALS]

blacstest.f: In subroutine `zsdrvtest':
blacstest.f:335: warning:
  CALL ZSDRVTEST(OUTNUM, VERB, NSHAPE, CMEM(UPLOPTR),
   1
blacstest.f:3577: (continued):
 SUBROUTINE ZSDRVTEST( OUTNUM, VERB, NSHAPE, UPLO0, DIAG0,
2
Argument #21 (named `mem') of `zsdrvtest' is one type at (2) but is  
some other type at (1) [info -f g77 M GLOBALS]

blacstest.f: In subroutine `sbsbrtest':
blacstest.f:389: warning:
  CALL SBSBRTEST(OUTNUM, VERB, NSCOPE, CMEM(SCOPEPTR),
   1
blacstest.f:4336: (continued):
 SUBROUTINE SBSBRTEST( OUTNUM, VERB, NSCOPE, SCOPE0, NTOP,  
TOP0,

2
Argument #23 (named `mem') of `sbsbrtest' is one type at (2) but is  
some other type at (1) [info -f g77 M GLOBALS]

blacstest.f: In subroutine `dbsbrtest':
blacstest.f:401: warning:
  CALL DBSBRTEST(OUTNUM, VERB, NSCOPE, CMEM(SCOPEPTR),
   1
blacstest.f:4751: (continued):
 SUBROUTINE DBSBRTEST( OUTNUM, VERB, NSCOPE, SCOPE0, NTOP,  
TOP0,

2
Argument #23 (named `mem') of `dbsbrtest' is one type at (2) but is  
some other type at (1) [info -f g77 M GLOBALS]

blacstest.f: In subroutine `cbsbrtest':
blacstest.f:413: warning:
  CALL CBSBRTEST(OUTNUM, VERB, NSCOPE, CMEM(SCOPEPTR),
   1
blacstest.f:5166: (continued):
 SUBROUTINE CBSBRTEST( OUTNUM, VERB, NSCOPE, SCOPE0, NTOP,  
TOP0,

2
Argument #23 (named `mem') of `cbsbrtest' is one type at (2) but is  
some other type at (1) [info -f g77 M GLOBALS]

blacstest.f: In subroutine `zbsbrtest':
blacstest.f:425: warning:
  CALL ZBSBRTEST(OUTNUM, VERB, NSCOPE, CMEM(SCOPEPTR),
   1
blacstest.f:5581: (continued):
 SUBROUTINE ZBSBRTEST( OUTNUM, VERB, NSCOPE, SCOPE0, NTOP,  
TOP0,

2
Argument #23 (named `mem') of `zbsbrtest' is one type at (2) but is  
some other type at (1) [info -f g77 M GLOBALS]

blacstest.f: In subroutine `sbtcheckin':
blacstest.f:120: warning:
CALL BTRECV( 3, 2, ITMP, 0, IBTMSGID() )
 1
blacstest.f:7429: (continued):
  CALL BTRECV(4, NERR2*2, SVAL, K, IBTMSGID()+51)
   2
Argument #3 of `btrecv' is one type at (2) but is some other type at  
(1) [info -f g77 M GLOBALS]

blacstest.f:97: warning:
CALL BTSEND( 3, 2, ITMP, -1, IBTMSGID() )
 1
blacstest.f:7451: (continued):
   CALL BTSEND(4, NERR*2, SVAL, 0, IBTMSGID()+51)
2
Argument #3 of `btsend' is one type at (2) but is some other type at  
(1) [info -f g77 M GLOBALS]

blacstest.f:2824: warning:
CALL SBTCHECKIN( 0, OUTNUM, MAXERR, NERR,
 1
blacstest.f:7339: (continued):
 SUBROUTINE SBTCHECKIN( NFTESTS, OUTNUM, MAXERR, NER