Terry -

It tells us that I'm not as smart as I thought :). If you're willing to help track this down, I'd like to try some other things that will require a more involved patch (it'll take me a day or two to get the patch right). Let me know if you'd be willing to look further (hopefully only another build or two) and I'll put the patch together.

Brian


On Wed, 21 May 2008, Terry Frankcombe wrote:

Hi Brian

I ran your experiment.  Changing the MMAP threshold made no difference
to the memory footprint (>8GB/process out of the box, an order of
magnitude smaller with --with-memory-manager=none).

What does that tell us?

Ciao
Terry



On Tue, 2008-05-20 at 06:51 -0600, Brian Barrett wrote:
Terry -

Would you be willing to do an experiment with the memory allocator?
There are two values we change to try to make IB run faster (at the
cost of corner cases you're hitting).  I'm not sure one is strictly
necessary, and I'm concerned that it's the one causing problems.  If
you don't mind recompiling again, would you change line 64 in
opal/mca/memory/ptmalloc2/malloc.c from:

#define DEFAULT_MMAP_THRESHOLD (2*1024*1024)

to:

#define DEFAULT_MMAP_THRESHOLD (128*1024)

And then recompile with the memory manager, obviously.  That will make
the mmap / sbrk cross-over point the same as the default allocator in
Linux.  There's still one other tweak we do, but I'm almost 100%
positive it's the threshold causing problems.


Brian


On May 19, 2008, at 8:17 PM, Terry Frankcombe wrote:

To tell you all what no one wanted to tell me, yes, it does seem to be
the memory manager.  Compiling everything with
--with-memory-manager=none returns the vmem use to the more reasonable
~100MB per process (down from >8GB).

I take it this may affect my peak bandwidth over infiniband.  What's
the general feeling about how bad this is?
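For reference, the rebuild Terry describes is done at configure time. A sketch, assuming a typical source build (the prefix is hypothetical; keep your site's usual configure options):

```shell
# Rebuild Open MPI 1.2.6 without the ptmalloc2 memory manager.
# --prefix is an example path; adjust to your installation.
./configure --with-memory-manager=none --prefix=/opt/openmpi-1.2.6
make all install
```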


On Tue, 2008-05-13 at 13:12 +1000, Terry Frankcombe wrote:
Hi folks

I'm trying to run an MPI app on an infiniband cluster with OpenMPI
1.2.6.

When run on a single node, this app is grabbing large chunks of memory
(total per process ~8.5GB, including strace showing a single 4GB grab)
but not using it.  The resident memory use is ~40MB per process.  When
this app is compiled in serial mode (with conditionals to remove the
MPI calls) the memory use is more like what you'd expect, 40MB res and
~100MB vmem.

Now I didn't write it so I'm not sure what extra stuff the MPI version
does, and we haven't tracked down the large memory grabs.

Could it be that this vmem is being grabbed by the OpenMPI memory
manager rather than directly by the app?

Ciao
Terry



_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



