On Aug 21, 2006, at 6:10 PM, Dave Grote wrote:

I have attached a small program that when run on my machine produces the error message below and locks up.

[node0000:06319] [mpool_gm_module.c:100] error(8) registering gm memory

I get the error when I run with 32 processors, but not with 4 (even if I increase the loop count to 20000). This is on a cluster of dual dual-core Opterons with Myrinet switches (i.e. using the gm routines). Unfortunately, I don't have the configure options that were used to build Open MPI, but I don't think there was anything unusual. I've also attached the ompi_info output. Here is the compile line for the code:

g95 -o allreducetest allreducetest.F -I/usr/local/ompi/1.1-gcc/include -L/usr/local/ompi/1.1-gcc/lib -lmpi

Also note that I had to change the Fortran include files in Open MPI to force all of the integers to be of size 4 (i.e. declaring them integer(4)), since the default integer size used by g95 is 8 bytes but the Open MPI Fortran interface was compiled with f77, which uses 4-byte integers.

Any suggestions on what to look for?

I believe you are running into a known issue with Open MPI 1.1. Can you try the 1.1.1 pre-release available on our web page:

  http://www.open-mpi.org/software/ompi/v1.1/

As for the Fortran fixes, the solution is to compile Open MPI with the same Fortran compiler you use for compiling your application. While g77 and g95 happen to be fairly close in compatibility, that usually isn't the case between Fortran compilers, so we don't try to be compatible across multiple Fortran compilers.
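
A rough sketch of what such a rebuild might look like, assuming g95 is on your PATH and you are building from an unpacked Open MPI source tree. The install prefix is made up for illustration, and any other configure options your site needs (for example for gm support) are not shown:

```shell
# Sketch only: build Open MPI with the same Fortran compiler as the application.
# F77 and FC tell configure which Fortran 77 / Fortran 90 compilers to use.
# The prefix below is hypothetical; pick whatever fits your installation.
./configure F77=g95 FC=g95 --prefix=/usr/local/ompi/1.1.1-g95
make all install

# Compile the test program with the wrapper compiler from that installation,
# so the include and library paths match the build.
/usr/local/ompi/1.1.1-g95/bin/mpif77 -o allreducetest allreducetest.F
```

With a build like this, the Fortran include files should no longer need hand edits to match the integer size.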

Brian


--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/

