[OMPI users] Problem running hpcc with a threaded BLAS
Hello everyone,

I'm testing my new cluster installation with the hpcc benchmark and Open MPI 1.2.1 on RHEL5 32-bit, and I'm having trouble using a threaded BLAS implementation.

I first tried ATLAS 3.7.30 compiled with pthread support. It crashes as reported here:
http://sourceforge.net/tracker/index.php?func=detail&aid=1708575&group_id=23725&atid=379483
If I link against the ATLAS version without pthread support, hpcc runs fine.

I have a problem with Goto BLAS 1.14 as well: the hpcc output stops before the HPL run, and the hpcc processes then appear to do nothing while consuming 100% CPU. If I set the maximum number of threads for Goto BLAS to 1, hpcc works fine again.

Open MPI itself was compiled without thread support. Can you give me a hint?

Regards,
Götz Waschk

--
AL I:40: Do what thou wilt shall be the whole of the Law.
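The "maximum number of threads" workaround for Goto BLAS is normally applied through environment variables set before the launch. A minimal sketch, assuming this Goto BLAS build reads GOTO_NUM_THREADS at run time (OMP_NUM_THREADS only matters if the library was built with OpenMP support); the rank count and binary path are placeholders:

# pin Goto BLAS to one thread inside each MPI rank (sketch, see assumptions above)
export GOTO_NUM_THREADS=1
export OMP_NUM_THREADS=1
mpirun -np 4 ./hpcc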
[OMPI users] [PATCH] small build fix for gm btl
Hello everyone,

I've found a bug while trying to build Open MPI 1.2.1 with progress threads and gm btl support: gcc had no problem with the missing header, but pgcc 7.0 complained. Please check the attached patch.

Regards,
Götz Waschk

--
AL I:40: Do what thou wilt shall be the whole of the Law.

--- openmpi-1.2.1/ompi/mca/btl/gm/btl_gm_component.c~   2007-04-19 18:30:53.0 +0200
+++ openmpi-1.2.1/ompi/mca/btl/gm/btl_gm_component.c    2007-04-27 14:50:04.0 +0200
@@ -45,6 +45,8 @@
 #if OMPI_ENABLE_PROGRESS_THREADS
+#include 
+
 static void* mca_btl_gm_progress_thread( opal_object_t* arg );
 #endif
 static int gm_reg_mr(void *reg_data, void *base, size_t size,
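For anyone applying the fix by hand, a short sketch; the patch file name is purely hypothetical, and -p0 matches the paths in the diff header (run it from the directory that contains openmpi-1.2.1/):

# hypothetical file name for the diff above; tree is assumed to be already configured
patch -p0 < btl_gm_progress_threads.patch
cd openmpi-1.2.1 && make all install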
Re: [OMPI users] Compile WRFV2.2 with OpenMPI
This is quite odd; we have tested OMPI 1.1.x with the Intel compilers quite a bit. In particular, it seems to be complaining about MPI_Fint and MPI_Comm, but these two types should have been typedef'ed earlier in mpi.h.

Can you send along the information listed on the "Getting Help" page on the web site, and also include your mpi.h file?

Thanks!

On Apr 26, 2007, at 5:28 PM, Jiming Jin wrote:

Dear Users:

I have been trying to use the Intel ifort and icc compilers to compile an atmospheric model called the Weather Research & Forecasting model (WRFV2.2) on a Linux cluster (x86_64), using Open MPI v1.2 that was also compiled with Intel icc. However, I got a lot of error messages like the following when compiling WRF:

/data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(788): error: expected an identifier
  OMPI_DECLSPEC MPI_Fint MPI_Comm_c2f(MPI_Comm comm);
/data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(802): error: "MPI_Comm" has already been declared in the current scope
  OMPI_DECLSPEC MPI_Comm MPI_Comm_f2c(MPI_Fint comm);
/data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(804): error: function "MPI_Comm" is not a type name
  OMPI_DECLSPEC int MPI_Comm_free(MPI_Comm *comm);
/data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(805): error: function "MPI_Comm" is not a type name
  OMPI_DECLSPEC int MPI_Comm_get_attr(MPI_Comm comm, int comm_keyval,
/data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(807): error: function "MPI_Comm" is not a type name
  OMPI_DECLSPEC int MPI_Comm_get_errhandler(MPI_Comm comm, MPI_Errhandler *erhandler);

I would highly appreciate it if someone could give me suggestions on how to fix the problem.

Jiming

--
Jiming Jin, PhD
Earth Sciences Division
Lawrence Berkeley National Lab
One Cyclotron Road, Mail-Stop 90-1116
Berkeley, CA 94720
Tel: 510-486-7551
Fax: 510-486-5686

--
Jeff Squyres
Cisco Systems
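A quick way to separate a broken installation from a WRF build-flag problem is to compile a trivial program that only pulls in mpi.h with icc directly. A minimal sketch, with the include path taken from the error messages above and the test file name purely illustrative:

cat > mpi_hdr_test.c <<'EOF'
/* does mpi.h parse cleanly under icc on its own? */
#include <mpi.h>
int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    MPI_Finalize();
    return 0;
}
EOF
icc -I/data/software/x86_64/open-mpi/1.1.4-intel/include -c mpi_hdr_test.c

If this compiles, the trouble is more likely in the flags and include paths WRF's configure generates; if it fails the same way, the installation's mpi.h (or a conflicting header picked up first) is the suspect.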
Re: [OMPI users] Compile WRFV2.2 with OpenMPI
From Jiming's error messages, it seems that he is using 1.1 libraries and header files while supposedly compiling for OMPI 1.2, therefore causing the undefined-type errors. Am I wrong in this assessment?

Daniel

On Fri, Apr 27, 2007 at 08:03:34AM -0400, Jeff Squyres wrote:
> This is quite odd; we have tested OMPI 1.1.x with the intel compilers
> quite a bit. In particular, it seems to be complaining about
> MPI_Fint and MPI_Comm, but these two types should have been
> typedef'ed earlier in mpi.h.
>
> Can you send along the information listed on the "Getting Help" page
> on the web site, and also include your mpi.h file?
>
> Thanks!
>
> On Apr 26, 2007, at 5:28 PM, Jiming Jin wrote:
>
> > Dear Users:
> >
> > I have been trying to use the intel ifort and icc compilers to
> > compile an atmospheric model called the Weather Research &
> > Forecasting model (WRFV2.2) on a Linux Cluster (x86_64) using
> > Open MPI v1.2 that was also compiled with INTEL ICC. However, I got a
> > lot of error messages as follows when compiling WRF.
> >
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(788): error: expected an identifier
> >   OMPI_DECLSPEC MPI_Fint MPI_Comm_c2f(MPI_Comm comm);
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(802): error: "MPI_Comm" has already been declared in the current scope
> >   OMPI_DECLSPEC MPI_Comm MPI_Comm_f2c(MPI_Fint comm);
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(804): error: function "MPI_Comm" is not a type name
> >   OMPI_DECLSPEC int MPI_Comm_free(MPI_Comm *comm);
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(805): error: function "MPI_Comm" is not a type name
> >   OMPI_DECLSPEC int MPI_Comm_get_attr(MPI_Comm comm, int comm_keyval,
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(807): error: function "MPI_Comm" is not a type name
> >   OMPI_DECLSPEC int MPI_Comm_get_errhandler(MPI_Comm comm, MPI_Errhandler *erhandler);
> >
> > I would highly appreciate it if someone could give me suggestions
> > on how to fix the problem.
> >
> > Jiming
> > --
> > Jiming Jin, PhD
> > Earth Sciences Division
> > Lawrence Berkeley National Lab
> > One Cyclotron Road, Mail-Stop 90-1116
> > Berkeley, CA 94720
> > Tel: 510-486-7551
> > Fax: 510-486-5686
>
> --
> Jeff Squyres
> Cisco Systems

--
Dr. Daniel Gruner              dgru...@chem.utoronto.ca
Dept. of Chemistry             daniel.gru...@utoronto.ca
University of Toronto          phone: (416)-978-8689
80 St. George Street           fax: (416)-978-5325
Toronto, ON M5S 3H6, Canada    finger for PGP public key
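One way to check this hypothesis is to ask the Open MPI wrapper compiler which installation it actually points at; a short sketch using the wrapper's --showme options, assuming the mpicc in PATH is the one used for the WRF build:

# print the flags the wrapper passes to the underlying compiler
mpicc --showme:compile    # the -I path should point at the 1.2 tree, not 1.1.4
mpicc --showme:link
ompi_info | grep "Open MPI:"    # reports the version of the installation in PATH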
Re: [OMPI users] bproc problems
Thanks to both you and David Gunter. I disabled pty support and it now works.

There is still the issue of the mpirun default being "-byslot", which causes all kinds of trouble. Only by using "-bynode" do things work properly.

Daniel

On Thu, Apr 26, 2007 at 02:28:33PM -0600, gshipman wrote:
> There is a known issue on BProc 4 w.r.t. pty support. Open MPI by
> default will try to use ptys for I/O forwarding but will revert to
> pipes if ptys are not available.
>
> You can "safely" ignore the pty warnings, or you may want to rerun
> configure and add:
> --disable-pty-support
>
> I say "safely" because my understanding is that some I/O data may be
> lost if pipes are used during abnormal termination.
>
> Alternatively you might try getting pty support working; you need to
> configure ptys on the backend nodes. You can then try the following
> code to test if it is working correctly; if this fails (it does on
> our BProc 4 cluster) you shouldn't use ptys on BProc.
>
> /* The header names were stripped by the list archive; these are the
>  * usual headers needed for this test on Linux. */
> #include <pty.h>      /* openpty() */
> #include <stdio.h>    /* printf() */
> #include <stdlib.h>
> #include <string.h>   /* strerror() */
> #include <errno.h>    /* errno */
>
> int
> main(int argc, char *argv[])
> {
>     int amaster, aslave;
>
>     if (openpty(&amaster, &aslave, NULL, NULL, NULL) < 0) {
>         printf("openpty() failed with errno = %d, %s\n", errno, strerror(errno));
>     } else {
>         printf("openpty() succeeded\n");
>     }
>
>     return 0;
> }
>
> On Apr 26, 2007, at 2:06 PM, Daniel Gruner wrote:
>
> > Hi
> >
> > I have been testing OpenMPI 1.2, and now 1.2.1, on several BProc-
> > based clusters, and I have found some problems/issues. All my
> > clusters have standard ethernet interconnects, either 100Base/T or
> > Gigabit, on standard switches.
> >
> > The clusters are all running Clustermatic 5 (BProc 4.x), and range
> > from 32-bit Athlon, to 32-bit Xeon, to 64-bit Opteron. In all cases
> > the same problems occur, identically. I attach here the results
> > from "ompi_info --all" and the config.log, for my latest build on
> > an Opteron cluster, using the Pathscale compilers. I had exactly
> > the same problems when using the vanilla GNU compilers.
> >
> > Now for a description of the problem:
> >
> > When running an mpi code (cpi.c, from the standard mpi examples, also
> > attached), using the mpirun defaults (e.g. -byslot), with a single
> > process:
> >
> > sonoma:dgruner{134}> mpirun -n 1 ./cpip
> > [n17:30019] odls_bproc: openpty failed, using pipes instead
> > Process 0 on n17
> > pi is approximately 3.1415926544231341, Error is 0.08333410
> > wall clock time = 0.000199
> >
> > However, if one tries to run more than one process, this bombs:
> >
> > sonoma:dgruner{134}> mpirun -n 2 ./cpip
> > .
> > .
> > .
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > .
> > . ad infinitum
> >
> > If one uses the option "-bynode", things work:
> >
> > sonoma:dgruner{145}> mpirun -bynode -n 2 ./cpip
> > [n17:30055] odls_bproc: openpty failed, using pipes instead
> > Process 0 on n17
> > Process 1 on n21
> > pi is approximately 3.1415926544231318, Error is 0.0887
> > wall clock time = 0.010375
> >
> > Note that there is always the message about "openpty failed, using
> > pipes instead".
> >
> > If I run more processes (on my 3-node cluster, with 2 cpus per
> > node), the openpty message appears repeatedly for the first node:
> >
> > sonoma:dgruner{146}> mpirun -bynode -n 6 ./cpip
> > [n17:30061] odls_bproc: openpty failed, using pipes instead
> > [n17:30061] odls_bproc: openpty failed, using pipes instead
> > Process 0 on n17
> > Process 2 on n49
> > Process 1 on n21
> > Process 5 on n49
> > Process 3 on n17
> > Process 4 on n21
> > pi is approximately 3.1415926544231239, Error is 0.0807
> > wall clock time = 0.050332
> >
> > Should I worry about the openpty failure? I suspect that communications
> > may be slower this way. Using the -byslot option always fails, so this
> > is a bug. The same occurs for all the codes that I have tried, both
> > simple and complex.
> >
> > Thanks for your attention to this.
> > Regards,
> > Daniel
> > --
> > Dr. Daniel Gruner              dgru...@chem.utoronto.ca
> > Dept. of Chemistry             daniel.gru...@utoronto.ca
> > University of Toronto          phone: (416)-978-8689
> > 80 St. George Street           fax: (416)-978-5325
> > Toronto, ON M5S 3H6, Canada    finger for PGP public key
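For completeness, a sketch of the rebuild gshipman suggests and the launch Daniel reports as working; the install prefix and process count are placeholders, and any other configure options used for the original build would need to be repeated:

./configure --prefix=/opt/openmpi-1.2.1 --disable-pty-support
make all install
# until the -byslot problem is sorted out, force round-robin placement:
mpirun -bynode -n 6 ./cpip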