Malte Cornils wrote:
> I've done some further research and found out a few things.

I apologize that I have not read your results in detail.  But I did
not want to hold off adding this information until I had, so some of
this may overlap with, or miss, points from your complete post.

> So, malloc appears to only use contiguous areas of address space with
> a single invocation, but allocating more memory in a different
> "address space hole" the second time around works fine.

I read through the glibc source and here are some of the interesting
comments from the glibc malloc.c implementation.  Just the Reader's
Digest version.  If you skim, make sure you read the section marked
"M_MMAP_THRESHOLD".

/*
  This is not the fastest, most space-conserving, most portable, or
  most tunable malloc ever written. However it is among the fastest
  while also being among the most space-conserving, portable and tunable.
  Consistent balance across these factors results in a good general-purpose
  allocator for malloc-intensive programs.

  The main properties of the algorithms are:
  * For large (>= 512 bytes) requests, it is a pure best-fit allocator,
    with ties normally decided via FIFO (i.e. least recently used).
  * For small (<= 64 bytes by default) requests, it is a caching
    allocator, that maintains pools of quickly recycled chunks.
  * In between, and for combinations of large and small requests, it does
    the best it can trying to meet both goals at once.
  * For very large requests (>= 128KB by default), it relies on system
    memory mapping facilities, if supported.

  For a longer but slightly out of date high-level description, see
     http://gee.cs.oswego.edu/dl/html/malloc.html
*/

Note the 128KB threshold at which it switches over to the system mmap().

/*
  Define HAVE_MMAP as true to optionally make malloc() use mmap() to
  allocate very large blocks.  These will be returned to the
  operating system immediately after a free(). Also, if mmap
  is available, it is used as a backup strategy in cases where
  MORECORE fails to provide space from system.

  This malloc is best tuned to work with mmap for large requests.
  If you do not have mmap, operations involving very large chunks (1MB
  or so) may be slower than you'd like.
*/

#ifndef HAVE_MMAP
#define HAVE_MMAP 1

So it definitely wants to use mmap() and it wants to use it for big
chunks.
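
A minimal sketch of my own (not from the original test program) that
shows both halves of that comment, using malloc_stats() to watch a
block above the threshold get mmap'd and then handed straight back to
the system on free():

  #include <stdio.h>
  #include <stdlib.h>
  #include <malloc.h>   /* glibc-specific: malloc_stats() prints to stderr */

  int main(void)
  {
      void *p;

      puts("before malloc:");
      malloc_stats();                 /* mmap counters start at zero */

      p = malloc(1024 * 1024);        /* 1MB, well above the 128KB threshold */
      puts("after malloc:");
      malloc_stats();                 /* "Total (incl. mmap)" grows by ~1MB */

      free(p);                        /* the mmap'd chunk is munmap'd at once */
      puts("after free:");
      malloc_stats();                 /* total system bytes drop back;
                                         "max mmap regions" is a high-water
                                         mark and stays at 1 */
      return 0;
  }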

/*
   MMAP_AS_MORECORE_SIZE is the minimum mmap size argument to use if
   sbrk fails, and mmap is used as a backup (which is done only if
   HAVE_MMAP).  The value must be a multiple of page size.  This
   backup strategy generally applies only when systems have "holes" in
   address space, so sbrk cannot perform contiguous expansion, but
   there is still space available on system.  On systems for which
   this is known to be useful (i.e. most linux kernels), this occurs
   only when programs allocate huge amounts of memory.  Between this,
   and the fact that mmap regions tend to be limited, the size should
   be large, to avoid too many mmap calls and thus avoid running out
   of kernel resources.
*/

#ifndef MMAP_AS_MORECORE_SIZE
#define MMAP_AS_MORECORE_SIZE (1024 * 1024)
#endif

So if sbrk() fails, malloc falls back to mmap() and grabs memory in
chunks of at least 1MB.
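
Here is a rough way to watch that fallback happen (my own sketch; the
RLIMIT_DATA trick assumes a kernel of that era, where the data-segment
limit constrains brk() but not anonymous mmap()):

  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/resource.h>

  int main(void)
  {
      struct rlimit rl;
      void *p;
      int i;

      /* Cap the data segment so sbrk() fails early; malloc must then
         fall back to mmap() in MMAP_AS_MORECORE_SIZE (1MB) pieces. */
      rl.rlim_cur = rl.rlim_max = 8 * 1024 * 1024;
      setrlimit(RLIMIT_DATA, &rl);

      for (i = 0; i < 512; i++) {
          p = malloc(64 * 1024);      /* below the 128KB threshold */
          if (p == NULL) {
              printf("malloc failed at iteration %d\n", i);
              break;
          }
          printf("%4d: %p\n", i, p);  /* the addresses jump to a new region
                                         once sbrk() stops extending the heap */
      }
      return 0;
  }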

/* Define USE_ARENAS to enable support for multiple `arenas'.  These
   are allocated using mmap(), are necessary for threads and
   occasionally useful to overcome address space limitations affecting
   sbrk(). */

#ifndef USE_ARENAS
#define USE_ARENAS HAVE_MMAP
#endif

Not really interesting unless we are using threads.  But it explains
the arena values from malloc_stats().

Here is a very interesting section.  Don't miss this.

/*
  M_MMAP_THRESHOLD is the request size threshold for using mmap()
  to service a request. Requests of at least this size that cannot
  be allocated using already-existing space will be serviced via mmap.
  (If enough normal freed space already exists it is used instead.)

  Using mmap segregates relatively large chunks of memory so that
  they can be individually obtained and released from the host
  system. A request serviced through mmap is never reused by any
  other request (at least not directly; the system may just so
  happen to remap successive requests to the same locations).

  Segregating space in this way has the benefits that:

   1. Mmapped space can ALWAYS be individually released back
      to the system, which helps keep the system level memory
      demands of a long-lived program low.
   2. Mapped memory can never become `locked' between
      other chunks, as can happen with normally allocated chunks, which
      means that even trimming via malloc_trim would not release them.
   3. On some systems with "holes" in address spaces, mmap can obtain
      memory that sbrk cannot.

  However, it has the disadvantages that:

   1. The space cannot be reclaimed, consolidated, and then
      used to service later requests, as happens with normal chunks.
   2. It can lead to more wastage because of mmap page alignment
      requirements
   3. It causes malloc performance to be more dependent on host
      system memory management support routines which may vary in
      implementation quality and may impose arbitrary
      limitations. Generally, servicing a request via normal
      malloc steps is faster than going through a system's mmap.

  The advantages of mmap nearly always outweigh disadvantages for
  "large" chunks, but the value of "large" varies across systems.  The
  default is an empirically derived value that works well in most
  systems.
*/

#define M_MMAP_THRESHOLD      -3

#ifndef DEFAULT_MMAP_THRESHOLD
#define DEFAULT_MMAP_THRESHOLD (128 * 1024)
#endif

This means that big requests are those of 128KB or more, and those are
serviced through mmap().  The threshold is a mallopt()-configurable
parameter.  Smaller requests are serviced through sbrk() until that
runs out; after sbrk() runs out, malloc falls back on mmap() to provide
more memory until that runs out too.  This explains the behavior we
have seen, with addresses being allocated first in one place and then
in the other.
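
Since the threshold is tunable, you can lower it and watch even small
requests get routed through mmap().  A minimal sketch of my own (the
4KB value is just an arbitrary choice for illustration):

  #include <stdlib.h>
  #include <malloc.h>   /* glibc: mallopt(), M_MMAP_THRESHOLD, malloc_stats() */

  int main(void)
  {
      void *p;

      /* Service anything of 4KB or more through mmap() instead of sbrk(). */
      mallopt(M_MMAP_THRESHOLD, 4 * 1024);

      p = malloc(64 * 1024);    /* would normally have come from sbrk() */
      malloc_stats();           /* "max mmap regions" should now be >= 1 */
      free(p);
      return 0;
  }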

I added a call to malloc_stats() to the test programs, basically this:

          printf("Out of memory at %ld MB\n",count / (1024 / 64));
          fflush(stdout);
          malloc_stats();
          sleep(30);
          exit(0);
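
For reference, the whole test amounts to something like the sketch
below (the 64KB chunk size matches the second case; the rest of the
program was not posted, so the surrounding details are my
reconstruction).  The sleep(30) leaves a window to look at
/proc/<pid>/maps before the process exits:

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <malloc.h>                /* glibc-specific: malloc_stats() */

  #define CHUNK (64 * 1024)          /* 64KB; use (1024 * 1024) for the 1MB case */

  int main(void)
  {
      long count = 0;

      /* Grab CHUNK-sized blocks until malloc() gives up. */
      while (malloc(CHUNK) != NULL)
          count++;

      printf("Out of memory at %ld MB\n", count / (1024 * 1024 / CHUNK));
      fflush(stdout);
      malloc_stats();                /* dump arena and mmap counters (stderr) */
      sleep(30);                     /* time to inspect /proc/<pid>/maps */
      exit(0);
  }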

For the 1MB malloc() case:

  Arena 0:
  system bytes     =  938486264
  in use bytes     =  938482680
  Total (incl. mmap):
  system bytes     = 3083831800
  in use bytes     = 3083828216
  max mmap regions =       2038
  max mmap bytes   = 2145345536

I modified this for a 64KB malloc() case:

  Arena 0:
  system bytes     = 3085863300
  in use bytes     = 2952052412
  Total (incl. mmap):
  system bytes     = 3085863300
  in use bytes     = 2952052412
  max mmap regions =          0
  max mmap bytes   =          0

This I don't understand.  I thought I did until this point.  It shows
that it never mmap'd any segments, yet I was able to get the same
amount of memory (give or take some overhead) from the system by using
smaller chunks.  I expected it to run out of sbrk() at 1GB, fall back
to mmap(), and allocate the remaining 2GB.  Perhaps this is only an
accounting strangeness in malloc_stats(), because it did allocate the
full 3GB of memory regardless, and we now know sbrk() fails after the
first 1GB.
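
One way to cross-check that accounting (my own suggestion, not
something I have run against these tests) would be mallinfo(), whose
hblks/hblkhd fields count only chunks handed out directly via mmap().
Note its fields are plain ints, so they wrap well before 3GB:

  #include <stdio.h>
  #include <stdlib.h>
  #include <malloc.h>   /* glibc-specific: struct mallinfo, mallinfo() */

  /* Report how much of malloc's memory came from the sbrk() arena versus
     from chunks mmap'd directly for individual large requests. */
  static void report(const char *when)
  {
      struct mallinfo mi = mallinfo();
      printf("%s: arena (sbrk) = %d bytes, mmap'd chunks = %d (%d bytes)\n",
             when, mi.arena, mi.hblks, mi.hblkhd);
  }

  int main(void)
  {
      void *small, *big;

      report("start");
      small = malloc(64 * 1024);     /* below the threshold: from the arena */
      big   = malloc(1024 * 1024);   /* above the threshold: from mmap */
      report("after mallocs");
      free(small);
      free(big);
      report("after frees");
      return 0;
  }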

> The question remains why the address space is split like it is. That is, the 
> program starts at 0x08000000, while the shared libraries are at 0x40000000. 
> So, approx. 900 MiB can be allocated here. Then, there is free address space 
> again after the shared libraries, up to approx. 0xb7000000. This is the 
> largest contiguous amount of free address space (ca. 2 GiB) and is what 
> limits my malloc test program, I suppose.

I believe this is done for the reasons given in the code comments
above: as a rough heuristic, memory is grouped into small chunks and
big chunks so that it can be coalesced more readily.
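
One way to see that layout directly is to dump /proc/self/maps from
inside the test program (or to cat /proc/<pid>/maps during the
sleep(30) above); a minimal sketch:

  #include <stdio.h>

  int main(void)
  {
      /* Print this process's address space layout: program text, heap,
         shared libraries, anonymous mmap regions, and the stack. */
      FILE *f = fopen("/proc/self/maps", "r");
      int c;

      if (f == NULL) {
          perror("fopen");
          return 1;
      }
      while ((c = getc(f)) != EOF)
          putchar(c);
      fclose(f);
      return 0;
  }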

> Anyone still following me? And should I try asking at glibc-devel or at the 
> Linux kernel mailing list?

You are pretty deep into the glibc malloc implementation.  At a guess
I would expect the glibc list to be more interested.  (But I am
interested, and I am not subscribed there. :-)

By the way, thanks for the good literature search and providing all of
those references!

Bob
