Re: [O-MPI users] Hpl Bench mark and Openmpi rc3

2005-10-17 Thread Jeff Squyres

On Oct 13, 2005, at 1:25 AM, Allan Menezes wrote:

   I have a 16 node cluster of x86 machines with FC3 running on the head
node. I used a beta version of OSCAR 4.2 for putting together the
cluster. It uses /home/allan as the NFS directory.


Greetings Allan.  Sorry for the delay in replying -- we were all at an 
Open MPI working meeting last week, and the schedule got a bit hectic.


Your setup sounds fine.


I tried MPICH2 v1.0.2p1 and got a benchmark of approximately 26 GFlops
with it. With Open MPI 1.0rc3, having set LD_LIBRARY_PATH in .bashrc and
the /opt/openmpi/bin path in .bash_profile in the home directory


Two quick notes here:

- Open MPI's mpirun supports the "--prefix" option, which obviates the 
need to set these variables in your .bashrc (although setting them 
permanently makes things easier in the long term -- you don't need to 
specify --prefix every time).  See the FAQ for more details on the 
--prefix option (a usage sketch also follows below these notes):


http://www.open-mpi.org/faq/?category=running#mpirun-prefix

- OSCAR makes use of environment modules; it uses them to differentiate 
between the different MPI implementations that OSCAR ships.  You can 
trivially add a modulefile for Open MPI (an example is sketched below) 
and then use the "switcher" command to switch easily between all the 
MPI implementations on your OSCAR cluster (once we hit 1.0, we 
anticipate having an OSCAR package).
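
For example, with Open MPI installed under /opt/openmpi on every node (a
sketch only -- substitute your actual install prefix, hostfile, and
process count):

   shell$ mpirun --prefix /opt/openmpi -hostfile aa -np 16 ./xhpl

With --prefix, mpirun prepends /opt/openmpi/bin to the PATH and
/opt/openmpi/lib to the LD_LIBRARY_PATH on the remote nodes before
launching, so nothing extra is needed in .bashrc.

Likewise, a minimal environment-modules file for the same install might
look like the following (the file name, location, and registration with
switcher depend on your OSCAR setup; this is only a sketch):

   #%Module1.0
   ## Example Open MPI modulefile; paths assume a /opt/openmpi prefix
   prepend-path PATH            /opt/openmpi/bin
   prepend-path LD_LIBRARY_PATH /opt/openmpi/lib
   prepend-path MANPATH         /opt/openmpi/man

Once registered, a command along the lines of "switcher mpi = openmpi"
selects it for subsequent logins (check the switcher documentation for
the exact syntax).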


I cannot seem to get performance beyond approximately 9 GFlops. The best 
block size for MPICH2 was 120. With Open MPI, for N = 22000 I have to 
use block sizes of 10-11 to get 9 GFlops; with larger block sizes (NB) 
it's worse. I used the same N = 22000 for MPICH2, and I have a 16-port 
Netgear Gigabit Ethernet switch with Realtek 8169 Gigabit Ethernet 
cards. Can anyone tell me why the performance with Open MPI is so low 
compared to MPICH2 1.0.2p1?
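
For reference, both the problem size and the block size are set in
HPL.dat; with the figures quoted above, the relevant lines of that file
(an excerpt only -- HPL.dat also carries process-grid and algorithm
settings) would read something like:

   1            # of problems sizes (N)
   22000        Ns
   1            # of NBs
   120          NBs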


There should clearly not be such a wide disparity in performance here; 
we don't see this kind of difference in our own internal testing.


Can you send the output of "ompi_info --all"?

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/



[O-MPI users] OpenMPI hang issue

2005-10-17 Thread Parrott, Chris
Greetings,

I have been testing OpenMPI 1.0rc3 on a rack of 8 2-processor (single
core) Opteron systems connected via both Gigabit Ethernet and Myrinet.
My testing has been mostly successful, although I have run into a
recurring issue on a few MPI applications.  The symptom is that the
computation seems to progress nearly to completion, and then suddenly
just hangs without terminating.  One code that demonstrates this is the
Tachyon parallel raytracer, available at:

  http://jedi.ks.uiuc.edu/~johns/raytracer/

I am using PGI 6.0-5 to compile OpenMPI, so that may be part of the root
cause of this particular problem.
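
Judging from the compiler and prefix information in the ompi_info output
below, the build would have been configured roughly along these lines (a
sketch only -- the actual configure invocation and flags may well have
differed):

   shell$ ./configure CC=pgcc CXX=pgCC F77=pgf77 FC=pgf90 \
              --prefix=/opt/openmpi-1.0rc3-pgi-6.0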

I have attached the output of config.log to this message.  Here is the
output from ompi_info:

Open MPI: 1.0rc3r7730
   Open MPI SVN revision: r7730
Open RTE: 1.0rc3r7730
   Open RTE SVN revision: r7730
OPAL: 1.0rc3r7730
   OPAL SVN revision: r7730
  Prefix: /opt/openmpi-1.0rc3-pgi-6.0
 Configured architecture: x86_64-unknown-linux-gnu
   Configured by: root
   Configured on: Mon Oct 17 10:10:28 PDT 2005
  Configure host: castor00
Built by: root
Built on: Mon Oct 17 10:29:20 PDT 2005
  Built host: castor00
  C bindings: yes
C++ bindings: yes
  Fortran77 bindings: yes (all)
  Fortran90 bindings: yes
  C compiler: pgcc
 C compiler absolute: /net/lisbon/opt/pgi-6.0-5/linux86-64/6.0/bin/pgcc
C++ compiler: pgCC
   C++ compiler absolute: /net/lisbon/opt/pgi-6.0-5/linux86-64/6.0/bin/pgCC
  Fortran77 compiler: pgf77
  Fortran77 compiler abs: /net/lisbon/opt/pgi-6.0-5/linux86-64/6.0/bin/pgf77
  Fortran90 compiler: pgf90
  Fortran90 compiler abs: /net/lisbon/opt/pgi-6.0-5/linux86-64/6.0/bin/pgf90
 C profiling: yes
   C++ profiling: yes
 Fortran77 profiling: yes
 Fortran90 profiling: yes
  C++ exceptions: no
  Thread support: posix (mpi: no, progress: no)
  Internal debug support: no
 MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
 libltdl support: 1
  MCA memory: malloc_hooks (MCA v1.0, API v1.0, Component v1.0)
   MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.0)
   MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.0)
   MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.0)
   MCA timer: linux (MCA v1.0, API v1.0, Component v1.0)
   MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
   MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.0)
MCA coll: self (MCA v1.0, API v1.0, Component v1.0)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.0)
  MCA io: romio (MCA v1.0, API v1.0, Component v1.0)
   MCA mpool: gm (MCA v1.0, API v1.0, Component v1.0)
   MCA mpool: sm (MCA v1.0, API v1.0, Component v1.0)
 MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.0)
 MCA pml: teg (MCA v1.0, API v1.0, Component v1.0)
 MCA pml: uniq (MCA v1.0, API v1.0, Component v1.0)
 MCA ptl: gm (MCA v1.0, API v1.0, Component v1.0)
 MCA ptl: self (MCA v1.0, API v1.0, Component v1.0)
 MCA ptl: sm (MCA v1.0, API v1.0, Component v1.0)
 MCA ptl: tcp (MCA v1.0, API v1.0, Component v1.0)
 MCA btl: gm (MCA v1.0, API v1.0, Component v1.0)
 MCA btl: self (MCA v1.0, API v1.0, Component v1.0)
 MCA btl: sm (MCA v1.0, API v1.0, Component v1.0)
 MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.0)
 MCA gpr: null (MCA v1.0, API v1.0, Component v1.0)
 MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.0)
 MCA gpr: replica (MCA v1.0, API v1.0, Component v1.0)
 MCA iof: proxy (MCA v1.0, API v1.0, Component v1.0)
 MCA iof: svc (MCA v1.0, API v1.0, Component v1.0)
  MCA ns: proxy (MCA v1.0, API v1.0, Component v1.0)
  MCA ns: replica (MCA v1.0, API v1.0, Component v1.0)
 MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
 MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.0)
 MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.0)
 MCA ras: localhost (MCA v1.0, API v1.0, Component v1.0)
 MCA ras: slurm (MCA v1.0, API v1.0, Component v1.0)
 MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.0)
 MCA rds: resfile (MCA v1.0, API v1.0, Component v1.0)
   MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.0)
MCA rmgr: proxy (MCA v1.0, API v

Re: [O-MPI users] OpenMPI hang issue

2005-10-17 Thread Tim S. Woodall

Hello Chris,

Please give the next release candidate a try. There was an issue
w/ the GM port that was likely causing this.

Thanks,
Tim



Re: [O-MPI users] Hpl Bench mark and Openmpi rc3 (Jeff Squyres)

2005-10-17 Thread Allan Menezes



Hi Jeff,
  I installed two slightly different builds of Open MPI: one in 
/opt/openmpi (otherwise I would get the gfortran error) and the other in 
/home/allan/openmpi.
However, I do not think that is the problem, as the path names are 
specified in the .bashrc and .bash_profile files in the /home/allan 
directory. I also log in as user allan, who is not a superuser. When 
running Open MPI with HPL I use the following command line:

a1> mpirun -mca pls_rsh_orted /home/allan/openmpi/bin/orted -hostfile aa 
-np 16 ./xhpl

from the directory where xhpl resides, such as /homer/open/bench. I use 
the -mca option pls_rsh_orted because otherwise it comes up with an 
error that it cannot find the ORTED daemon on machines a1, a2, etc. That 
is probably a configuration error. However, the command above and the 
setup described work fine, and there are no errors in the HPL.out file, 
except that it is slow.
I use an ATLAS BLAS library for building xhpl from hpl.tar.gz. The 
makefile for HPL uses the ATLAS libs and the Open MPI mpicc compiler for 
both compilation and linking, and I have zeroed out the MPI macro paths 
in Make.open (that's what I renamed the HPL makefile) for "make 
arch=open" in the hpl directory (a sketch of those settings is below). 
Please find attached the ompi_info --all output as requested. Thank you 
very much:
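
For reference, the Make.open settings described above would look roughly
like the following excerpt (a sketch only -- the ATLAS directory and
library names below are placeholders, not the actual values used here):

   ARCH         = open
   # MPI paths left empty; the mpicc wrapper already supplies them:
   MPdir        =
   MPinc        =
   MPlib        =
   # ATLAS BLAS (placeholder path -- point LAdir at your ATLAS install):
   LAdir        = /usr/local/atlas/lib
   LAinc        =
   LAlib        = $(LAdir)/libcblas.a $(LAdir)/libatlas.a
   # Compile and link with Open MPI's wrapper compiler:
   CC           = /home/allan/openmpi/bin/mpicc
   LINKER       = $(CC)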

Allan


Open MPI: 1.0rc3r7730
   Open MPI SVN revision: r7730
Open RTE: 1.0rc3r7730
   Open RTE SVN revision: r7730
OPAL: 1.0rc3r7730
   OPAL SVN revision: r7730
  MCA memory: malloc_hooks (MCA v1.0, API v1.0, Component v1.0)
   MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.0)
   MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.0)
   MCA timer: linux (MCA v1.0, API v1.0, Component v1.0)
   MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
   MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.0)
MCA coll: self (MCA v1.0, API v1.0, Component v1.0)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.0)
  MCA io: romio (MCA v1.0, API v1.0, Component v1.0)
   MCA mpool: sm (MCA v1.0, API v1.0, Component v1.0)
 MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.0)
 MCA pml: teg (MCA v1.0, API v1.0, Component v1.