On Aug 21, 2006, at 6:10 PM, Dave Grote wrote:
I have attached a small program that when run on my machine
produces the error message below and locks up.
[node0000:06319] [mpool_gm_module.c:100] error(8) registering gm
memory
I get the error when I run with 32 processors, but not with 4 (even
if I increase the loop count to 20000). This is on a cluster of
dual dual-core Opterons with Myrinet switches (i.e. using the gm
routines). Unfortunately, I don't have the configure options that
were used to build openmpi, but I don't think there was anything
unusual. I've also attached the ompi_info output. Here is the
compile line for the code:
g95 -o allreducetest allreducetest.F -I/usr/local/ompi/1.1-gcc/include -L/usr/local/ompi/1.1-gcc/lib -lmpi
Also note that I had to change the Fortran include files in Open MPI
to force all of the integers to be of size 4 (i.e. declaring them
integer(4)), since the default integer size used by g95 is 8 bytes
but the Open MPI Fortran interface was compiled with f77, which uses
4-byte integers.
Any suggestions on what to look for?
I believe you are running into a known issue with Open MPI 1.1. Can
you try the 1.1.1 pre-release available on our web page:
http://www.open-mpi.org/software/ompi/v1.1/
As for the Fortran fixes, the solution is to compile Open MPI with
the same Fortran compiler you use for compiling your application.
While g77 and g95 happen to be fairly close in compatibility, that
is usually not the case between Fortran compilers, so we don't try
to be compatible across multiple Fortran compilers.
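
As a rough sketch of that rebuild (not the exact commands for your
site): Open MPI's configure script can be pointed at a specific
Fortran compiler through the F77 and FC variables. The install prefix
below is a hypothetical path chosen for illustration, and the sketch
assumes g95 is on your PATH and that you are in the top directory of
an unpacked Open MPI source tree.

```shell
# Build Open MPI's Fortran bindings with the same compiler as the application.
# F77 and FC select the Fortran 77 and Fortran 90 compilers, respectively.
# The prefix is a hypothetical location, not taken from the original install.
./configure F77=g95 FC=g95 --prefix=/usr/local/ompi/1.1.1-g95
make all install

# Rebuild the test with the wrapper compiler from the new install, so the
# include and library paths match the g95-built Fortran bindings.
/usr/local/ompi/1.1.1-g95/bin/mpif77 -o allreducetest allreducetest.F
```

With the bindings built this way, the hand edits to the Fortran
include files should no longer be necessary.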
Brian
--
Brian Barrett
Open MPI developer
http://www.open-mpi.org/