[OMPI users] Problem running hpcc with a threaded BLAS

2007-04-27 Thread Götz Waschk

Hello everyone,

I'm testing my new cluster installation with the hpcc benchmark and
Open MPI 1.2.1 on RHEL5 32-bit, and I have some trouble using a
threaded BLAS implementation. I have tried ATLAS 3.7.30 compiled with
pthread support; it crashes as reported here:
http://sourceforge.net/tracker/index.php?func=detail&aid=1708575&group_id=23725&atid=379483

If I link against the ATLAS version without pthread support, hpcc runs fine.
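
For completeness, the serial and threaded ATLAS interfaces are separate
libraries, so the difference between the two builds is roughly the link
line below (install path and variable name are just placeholders):

    ATLAS=/usr/local/atlas/lib
    # serial ATLAS BLAS (runs fine):
    BLASLIBS="-L$ATLAS -lf77blas -lcblas -latlas"
    # threaded ATLAS BLAS (crashes as reported above):
    # BLASLIBS="-L$ATLAS -lptf77blas -lptcblas -latlas -lpthread"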

I also have a problem with Goto BLAS 1.14: the hpcc output stops
before the HPL run, and the hpcc processes then appear to do nothing
while consuming 100% CPU. If I set the maximum number of threads for
Goto BLAS to 1, hpcc works fine again.
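
Limiting Goto BLAS to a single thread can typically be done through its
GOTO_NUM_THREADS environment variable; with Open MPI it is safest to
export it to the launched ranks explicitly via mpirun's -x option, e.g.
(the process count here is just an example):

    export GOTO_NUM_THREADS=1
    mpirun -np 4 -x GOTO_NUM_THREADS ./hpcc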

Open MPI itself was compiled without thread support.

Can you give me a hint?

Regards, Götz Waschk

--
AL I:40: Do what thou wilt shall be the whole of the Law.



[OMPI users] [PATCH] small build fix for gm btl

2007-04-27 Thread Götz Waschk

Hello everyone,

I've found a bug while trying to build Open MPI 1.2.1 with progress
threads and gm BTL support: a header is missing in btl_gm_component.c.
gcc did not complain about the missing header, but pgcc 7.0 did. Please
see the attached patch.

Regards, Götz Waschk

--
AL I:40: Do what thou wilt shall be the whole of the Law.
--- openmpi-1.2.1/ompi/mca/btl/gm/btl_gm_component.c~	2007-04-19 18:30:53.0 +0200
+++ openmpi-1.2.1/ompi/mca/btl/gm/btl_gm_component.c	2007-04-27 14:50:04.0 +0200
@@ -45,6 +45,8 @@
 
 
 #if OMPI_ENABLE_PROGRESS_THREADS
+#include 
+
 static void* mca_btl_gm_progress_thread( opal_object_t* arg );
 #endif
 static int gm_reg_mr(void *reg_data, void *base, size_t size,
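
For anyone applying the patch by hand, something like the following
should work from the directory containing the extracted source tree
(the patch file name is just a placeholder):

    cd openmpi-1.2.1
    patch -p1 < btl_gm_missing_header.patch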


Re: [OMPI users] Compile WRFV2.2 with OpenMPI

2007-04-27 Thread Jeff Squyres
This is quite odd; we have tested OMPI 1.1.x with the intel compilers  
quite a bit.  In particular, it seems to be complaining about  
MPI_Fint and MPI_Comm, but these two types should have been  
typedef'ed earlier in mpi.h.


Can you send along the information listed on the "Getting Help" page  
on the web site, and also include your mpi.h file?
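
For reference, that information can usually be gathered with something
like the following, plus the config.log from your build tree:

    ompi_info --all > ompi_info.txt

and then attach ompi_info.txt along with the mpi.h in question.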


Thanks!



On Apr 26, 2007, at 5:28 PM, Jiming Jin wrote:


Dear Users:

 I have been trying to use the Intel ifort and icc compilers to
compile an atmospheric model, the Weather Research & Forecasting model
(WRFV2.2), on a Linux cluster (x86_64) using Open MPI v1.2, which was
also built with the Intel compilers.  However, I get a lot of error
messages like the following when compiling WRF:
/data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(788): error: expected an identifier
  OMPI_DECLSPEC  MPI_Fint MPI_Comm_c2f(MPI_Comm comm);
  ^

/data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(802): error: "MPI_Comm" has already been declared in the current scope
  OMPI_DECLSPEC  MPI_Comm MPI_Comm_f2c(MPI_Fint comm);
  ^

/data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(804): error: function "MPI_Comm" is not a type name
  OMPI_DECLSPEC  int MPI_Comm_free(MPI_Comm *comm);
   ^

/data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(805): error: function "MPI_Comm" is not a type name
  OMPI_DECLSPEC  int MPI_Comm_get_attr(MPI_Comm comm, int comm_keyval,
   ^

/data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(807): error: function "MPI_Comm" is not a type name
  OMPI_DECLSPEC  int MPI_Comm_get_errhandler(MPI_Comm comm, MPI_Errhandler *erhandler);


I would highly appreciate it if someone could give me suggestions  
on how to fix the problem.


Jiming
--
Jiming Jin, PhD
Earth Sciences Division
Lawrence Berkeley National Lab
One Cyclotron Road, Mail-Stop 90-1116
Berkeley, CA 94720
Tel: 510-486-7551
Fax: 510-486-5686



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Compile WRFV2.2 with OpenMPI

2007-04-27 Thread Daniel Gruner
From Jiming's error messages, it seems that he is using the 1.1
libraries and header files while supposedly compiling for Open MPI 1.2,
which would explain the undefined types.  Am I wrong in this assessment?
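
A quick sanity check, assuming the Open MPI wrapper compilers are being
used at all, is to see which installation they actually point at, e.g.:

    which mpicc
    mpicc -showme:compile    # shows the -I path, i.e. which mpi.h gets included
    ompi_info | head         # reports the Open MPI version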

Daniel


On Fri, Apr 27, 2007 at 08:03:34AM -0400, Jeff Squyres wrote:
> This is quite odd; we have tested OMPI 1.1.x with the intel compilers  
> quite a bit.  In particular, it seems to be complaining about  
> MPI_Fint and MPI_Comm, but these two types should have been  
> typedef'ed earlier in mpi.h.
> 
> Can you send along the information listed on the "Getting Help" page  
> on the web site, and also include your mpi.h file?
> 
> Thanks!
> 
> 
> 
> On Apr 26, 2007, at 5:28 PM, Jiming Jin wrote:
> 
> > Dear Users:
> >
> >  I have been trying to use the intel ifort and icc compilers to  
> > compile an atmospheric model called the Weather Research &  
> > Forecasting model (WRFV2.2) on a Linux Cluster (x86_64) using Open- 
> > MPI v1.2 that were also compiled with INTEL ICC.   However, I got a  
> > lot of error messages as follows when compiling WRF.
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(788):  
> > error: expected an identifier
> >   OMPI_DECLSPEC  MPI_Fint MPI_Comm_c2f(MPI_Comm comm);
> >   ^
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(802):  
> > error: "MPI_Comm" has already been declared in the current scope
> >   OMPI_DECLSPEC  MPI_Comm MPI_Comm_f2c(MPI_Fint comm);
> >   ^
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(804):  
> > error: function "MPI_Comm" is not a type name
> >   OMPI_DECLSPEC  int MPI_Comm_free(MPI_Comm *comm);
> >^
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(805):  
> > error: function "MPI_Comm" is not a type name
> >   OMPI_DECLSPEC  int MPI_Comm_get_attr(MPI_Comm comm, int comm_keyval,
> >^
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(807):  
> > error: function "MPI_Comm" is not a type name
> >   OMPI_DECLSPEC  int MPI_Comm_get_errhandler(MPI_Comm comm,  
> > MPI_Errhandler *erhandler);
> >
> > I would highly appreciate it if someone could give me suggestions  
> > on how to fix the problem.
> >
> > Jiming
> > --
> > Jiming Jin, PhD
> > Earth Sciences Division
> > Lawrence Berkeley National Lab
> > One Cyclotron Road, Mail-Stop 90-1116
> > Berkeley, CA 94720
> > Tel: 510-486-7551
> > Fax: 510-486-5686
> >
> >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> -- 
> Jeff Squyres
> Cisco Systems
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 

Dr. Daniel Gruner                dgru...@chem.utoronto.ca
Dept. of Chemistry               daniel.gru...@utoronto.ca
University of Toronto            phone:  (416)-978-8689
80 St. George Street             fax:    (416)-978-5325
Toronto, ON  M5S 3H6, Canada     finger for PGP public key


Re: [OMPI users] bproc problems

2007-04-27 Thread Daniel Gruner
Thanks to both you and David Gunter.  I disabled pty support and
it now works.  
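
Concretely, that just meant re-running configure with the flag
suggested below and rebuilding, roughly:

    ./configure --disable-pty-support ...
    make all install

(the "..." stands for the rest of the usual configure options).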

There is still the issue of the mpirun default being "-byslot", which
causes all kinds of trouble.  Only by using "-bynode" do things work
properly.

Daniel

On Thu, Apr 26, 2007 at 02:28:33PM -0600, gshipman wrote:
> There is a known issue on BProc 4 w.r.t. pty support. Open MPI by  
> default will try to use ptys for I/O forwarding but will revert to  
> pipes if ptys are not available.
> 
> You can "safely" ignore the pty warnings, or you may want to rerun  
> configure and add:
> --disable-pty-support
> 
> I say "safely" because my understanding is that some I/O data may be  
> lost if pipes are used during abnormal termination.
> 
> Alternatively you might try getting pty support working, you need to  
> configure ptys on the backend nodes.
> You can then try the following code to test if it is working  
> correctly, if this fails (it does on our BProc 4 cluster) you  
> shouldn't use ptys on BProc.
> 
> 
> #include <stdio.h>    /* printf() */
> #include <string.h>   /* strerror() */
> #include <errno.h>    /* errno */
> #include <pty.h>      /* openpty() on Linux/glibc; <util.h> on some BSDs */
> 
> int
> main(int argc, char *argv[])
> {
>int amaster, aslave;
> 
>if (openpty(&amaster, &aslave, NULL, NULL, NULL) < 0) {
>      printf("openpty() failed with errno = %d, %s\n", errno, strerror(errno));
>} else {
>  printf("openpty() succeeded\n");
>}
> 
>return 0;
> }
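
For reference: on Linux/glibc openpty() lives in libutil, so the test
above needs -lutil to link. A build/run sketch, with placeholder file
names:

    gcc -o openpty_test openpty_test.c -lutil
    ./openpty_test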
> 
> 
> 
> 
> 
> 
> On Apr 26, 2007, at 2:06 PM, Daniel Gruner wrote:
> 
> > Hi
> >
> > I have been testing OpenMPI 1.2, and now 1.2.1, on several BProc-
> > based clusters, and I have found some problems/issues.  All my
> > clusters have standard ethernet interconnects, either 100Base/T or
> > Gigabit, on standard switches.
> >
> > The clusters are all running Clustermatic 5 (BProc 4.x), and range
> > from 32-bit Athlon, to 32-bit Xeon, to 64-bit Opteron.  In all cases
> > the same problems occur, identically.  I attach here the results
> > from "ompi_info --all" and the config.log, for my latest build on
> > an Opteron cluster, using the Pathscale compilers.  I had exactly
> > the same problems when using the vanilla GNU compilers.
> >
> > Now for a description of the problem:
> >
> > When running an mpi code (cpi.c, from the standard mpi examples, also
> > attached), using the mpirun defaults (e.g. -byslot), with a single
> > process:
> >
> > sonoma:dgruner{134}> mpirun -n 1 ./cpip
> > [n17:30019] odls_bproc: openpty failed, using pipes instead
> > Process 0 on n17
> > pi is approximately 3.1415926544231341, Error is 0.08333410
> > wall clock time = 0.000199
> >
> > However, if one tries to run more than one process, this bombs:
> >
> > sonoma:dgruner{134}> mpirun -n 2 ./cpip
> > .
> > .
> > .
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > .
> > . ad infinitum
> >
> > If one uses the option "-bynode", things work:
> >
> > sonoma:dgruner{145}> mpirun -bynode -n 2 ./cpip
> > [n17:30055] odls_bproc: openpty failed, using pipes instead
> > Process 0 on n17
> > Process 1 on n21
> > pi is approximately 3.1415926544231318, Error is 0.0887
> > wall clock time = 0.010375
> >
> >
> > Note that there is always the message about "openpty failed, using  
> > pipes instead".
> >
> > If I run more processes (on my 3-node cluster, with 2 cpus per  
> > node), the
> > openpty message appears repeatedly for the first node:
> >
> > sonoma:dgruner{146}> mpirun -bynode -n 6 ./cpip
> > [n17:30061] odls_bproc: openpty failed, using pipes instead
> > [n17:30061] odls_bproc: openpty failed, using pipes instead
> > Process 0 on n17
> > Process 2 on n49
> > Process 1 on n21
> > Process 5 on n49
> > Process 3 on n17
> > Process 4 on n21
> > pi is approximately 3.1415926544231239, Error is 0.0807
> > wall clock time = 0.050332
> >
> >
> > Should I worry about the openpty failure?  I suspect that  
> > communications
> > may be slower this way.  Using the -byslot option always fails, so  
> > this
> > is a bug.  The same occurs for all the codes that I have tried,  
> > both simple
> > and complex.
> >
> > Thanks for your attention to this.
> > Regards,
> > Daniel
> > -- 
> >
> > Dr. Daniel Gruner                dgru...@chem.utoronto.ca
> > Dept. of Chemistry               daniel.gru...@utoronto.ca
> > University of Toronto            phone:  (416)-978-8689
> > 80 St. George Street             fax:    (416)-978-5325
> > Toronto, ON  M5S 3H6, Canada     finger for PGP public key
> > 
> > 
> > 
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
>