Terry -
Would you be willing to do an experiment with the memory allocator?
There are two values we change to try to make IB run faster (at the
cost of the corner cases you're hitting). I'm not sure one of them is
strictly necessary, and I'm concerned it's the one causing your
problems. If you don't mind recompiling again, would you change line 64
in opal/mca/memory/ptmalloc2/malloc.c from:
#define DEFAULT_MMAP_THRESHOLD (2*1024*1024)
to:
#define DEFAULT_MMAP_THRESHOLD (128*1024)
And then recompile with the memory manager, obviously. That will make
the mmap/sbrk cross-over point the same as the default allocator in
Linux. There's still one other tweak we do, but I'm almost 100%
positive it's the threshold causing problems.
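As a rough illustration of what that cross-over means, here's a
standalone sketch, assuming glibc and not taken from Open MPI itself:
mallopt() sets the same threshold at run time, requests at or above it
are served by mmap() and handed back to the OS on free(), and smaller
requests come from the sbrk-managed heap.

#include <malloc.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Same 128 KiB cross-over as the default Linux allocator. */
    mallopt(M_MMAP_THRESHOLD, 128 * 1024);

    char *small = malloc(64 * 1024);   /* below threshold: sbrk heap */
    char *large = malloc(4UL << 20);   /* above threshold: anonymous mmap */

    printf("small=%p large=%p\n", (void *)small, (void *)large);

    free(large);   /* unmapped, returned to the OS immediately */
    free(small);   /* kept in the heap for reuse */
    return 0;
}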
Brian
On May 19, 2008, at 8:17 PM, Terry Frankcombe wrote:
To tell you all what no one wanted to tell me: yes, it does seem to be
the memory manager. Compiling everything with
--with-memory-manager=none returns the vmem use to the more reasonable
~100MB per process (down from >8GB).
I take it this may affect my peak bandwidth over InfiniBand. What's the
general feeling about how bad this is?
On Tue, 2008-05-13 at 13:12 +1000, Terry Frankcombe wrote:
Hi folks
I'm trying to run an MPI app on an InfiniBand cluster with Open MPI
1.2.6.

When run on a single node, this app is grabbing large chunks of memory
(total per process ~8.5GB, including strace showing a single 4GB grab)
but not using it. The resident memory use is ~40MB per process. When
this app is compiled in serial mode (with conditionals to remove the
MPI calls) the memory use is more like what you'd expect, 40MB res and
~100MB vmem.

Now, I didn't write it, so I'm not sure what extra stuff the MPI
version does, and we haven't tracked down the large memory grabs.

Could it be that this vmem is being grabbed by the Open MPI memory
manager rather than directly by the app?
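By "vmem" and "res" above I mean virtual size and resident set size,
i.e. the VmSize and VmRSS lines in /proc/<pid>/status. A minimal
sketch, assuming Linux /proc and nothing Open MPI specific, for
printing the two from inside a process:

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Dump the virtual and resident sizes of the current process. */
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];

    if (!f)
        return 1;

    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "VmSize:", 7) == 0 ||
            strncmp(line, "VmRSS:", 6) == 0)
            fputs(line, stdout);
    }

    fclose(f);
    return 0;
}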
Ciao
Terry
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
Brian Barrett
Open MPI developer
http://www.open-mpi.org/