# New Ticket Created by  Leopold Toetsch 
# Please include the string:  [perl #17766]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=17766 >


Attached docu is for review and might also go in.
Suggestions, bugfixes, comments welcome.

leo

PS sorry, no MANIFEST update, mine differs too much ...


-- attachment  1 ------------------------------------------------------
url: http://rt.perl.org/rt2/attach/39249/31851/1c9a63/memory_internals_pod.patch

--- parrot/docs/memory_internals.pod    Sat Oct  5 11:19:01 2002
+++ parrot-leo/docs/memory_internals.pod        Sat Oct  5 11:12:48 2002
@@ -0,0 +1,254 @@
+=head1 NAME
+
+Parrot - Memory Internals
+
+=head1 VERSION
+
+0.0.8 Oct 2002
+
+=head1 ABSTRACT
+
+This document tries to explain internals of parrot structures related
+to memory management and should give the answer 42, when questions are
+related to the whole and everything and memory management.
+
+=head1 OVERVIEW
+
+All allocated items are managed in memory pools. A memory pools
+holds collections of similar items. These pools can be roughly divided into
+two kinds: pools holding fixed sized items and variable sized items.
+
+A pool is organized as a linked list of big chunks, holding many managed
+items of one kind.
+
+=head2 Abbrevs used here
+
+DOD ... dead object detection, s. e.g. docs/pdds/pdd09_gc.pod for details.
+
+By scanning through all the interpreters registers, stacks and the processor
+stack, all objects, that are detected here, are marked as alive.
+Objects in all pools not alive are considered dead therefore.
+
+=head1 Top down: the interpreter
+
+A overall layout of interpreters memory management looks like so:
+
+    typedef struct Parrot_Interp {
+       ...
+       struct Arenas *arena_base;
+       ...
+    } Interp;
+
+All object like things, that get allocated during the execution of
+parrot byte code are managed here. Exceptions currently are objects
+allocated early in startup e.g. the names of the pools themselfes.
+
+=head1 Arenas
+
+I<struct Arenas> holds pointer to different subkinds of managed memory, a
+simplification looks similar to this:
+
+    struct Arenas {
+       struct Memory_Pool *memory_pool;
+       ...
+       struct Small_Object_Pool * header_pool;
+    }
+
+I<Memory_pool> and I<header_pool> are variable and fixed sized pool
+pointers respectively.
+
+=head1 Memory_Pool
+
+Here are all variable sized items allocated, managed in linked lists of
+I<Memory_Blocks> - but see below.
+
+=head1 Small_Object_Pool
+
+This pool manages I<Small_Object_Arena>'s, linked together, which provide
+the space for the fixed sized objects.
+
+=head1 Fixed sized items
+
+These items are either objects by them self, like a I<PMC>, or are a
+header structure of a variable sized object like a I<STRING>.
+The general object of this kind is a buffer_like object, which consists
+of a I<Buffer> at the beginning and a variable but fixed sized part after the
+buffer structure. Examples of this are e.g. a I<STRING> or an I<IntList>.
+
+Buffer_like object's of the same size are maintained in the
+I<sized_header_pools>, which manages objects of the same size in one slot.
+
+Actually, with some reorganization, a I<PMC> is not different to the general
+case and will be united in the near future.
+
+All fixed sized objects are allocated with alloc_objects(), and first put
+onto the pool's free_list. When there is need for a new object it's taken
+off the free_list and when it's unused (detected during a DOD run), it
+will be put back on the free list again.
+
+If the free_list is empty, first a DOD run is started, which may put
+dead objects there, or if not, a new pool is allocated to hold more
+objects.
+
+These fixed sized objects are never freed during the lifetime of an
+interpreter, they get just reused or recycled.
+
+=head2 General structure of a buffer_like item
+
+    struct parrot_object_t {
+       struct {
+           void *bufstart;
+           size_t buflen;
+       } b;
+       unsigned flags;
+       ...
+    } parrot_object;
+
+This does not totally reflect the current implementation, but is the
+spirit of the abstraction of current objects. Above structure including
+I<flags> is the current I<Buffer>. With some additional fields like I<strstart>
+and I<bufused>, inserted at the ellipses, you get a I<STRING>.
+
+I<PMC>s will be reworked to fit into this structure.
+
+=head1 Variable sized items
+
+These items never live alone, they are part of a I<Buffer> structure,
+described above. They are allocated at I<bufstart>. This is e.g. used,
+to manage the buffers free list, where I<bufstart> is used as a next object
+pointer.
+
+These items are managed in two different pools: the I<memory_pool> and the
+I<constant_string_pool>. The former holds all variable sized items, while the
+latter containing the word "string", holds constant strings only, as we
+don't have other variable sized constant items to store.
+
+Here, different memory allocation scemes jump in:
+
+=head2 Copying GC
+
+A I<memory_pool> gets allocated in big blocks, namely a I<Memory_Block>.
+
+When some part is needed, e.g. to store a string, this part is carved out of
+the memory block, until this block is used up. If no more space is available in
+this memory block, a garbage collections (GC) is started.  This copies all
+living items of all used memory blocks into one new block, which holds
+thereafter only used items tightly packed together.
+
+The old memory blocks, containing sparse unused parts and used parts already
+copied to the new place, are then unused alltogether and get free()ed
+thereafter.
+
+When GC doesn't provide enough free space, needed for a new item, a new
+block is added to the memory pool.
+
+This also implies, that buffers are moved around during their life. Users of
+these buffers are therefore not allowed to hold pointers to buffers over pieces
+of code, that might trigger a GC run, like Parrot_allocate() or
+string_compare().
+
+=head2 Defragmentating allocator
+
+An alternative to above is, to use a memory allocator, which is as fast
+as above and does resuse free()ed items in a memory conserving way.
+Actually I<libc> provides this, based on Lea's allocator, and parrot
+includes one, for systems not having such a nice feature.
+
+Using this allocator, all variable sized items are just allocated via a
+plain malloc() call, or resized with realloc(), and after they lost
+there existence, this is, when DOD detected, that the managing buffer
+header is not used anymore, they are free()ed. That's all. The
+underlaying allocator collect's this pieces, does coalesce them to bigger
+parts - if possible, keeps them on free lists, sorted by their sizes, and
+when a new malloc() arrives, eventually gives them back to parrot.
+
+So here, the variable length I<memory_pool> is unused. You can consider this
+pool to live inside the allocator.
+
+Buffers allocated this way don't move around, except when reallocated of
+course.
+
+The B<Configure> options I<--gc> allows to use either method.
+
+=head2 Buffer_Tail and COW
+
+Both implementaions have the same problem to solve: I<STRING>s that get
+copied, or parts of strings as the result of a substr() call, do not
+allocate new space in the memory pool to hold this string or part of
+string. They just use a new_string_header() and set up a pointer
+(I<strstart>) pointing somewhere into the original string and remember the
+used length there in I<bufused>.
+
+This is all ok, as long as the original and the lazy copy of the string
+are not modified. So that's well known and called COW (copy on write).
+
+Now, during GC (or freeing buffers) the problem arises: to whome belongs
+this string. You shouldn't copy the same string to different places,
+thus rendering COW to a noop, nor are you allowed to free this same
+string more then once, B<gdb> i.e. your debugger will tell you why.
+
+Both allocation scemes therefore use a I<Buffer_Tail>. After the end of the
+actual variable length string there is some space for a flag, to remember,
+what to do with such strings.
+
+Copying GC marks such COW strings with I<TAIL_moved> and stores the new
+address in the buffer header, so other users of this string can be
+updated to reuse the same string (XXX one or all other users?).
+
+The malloc()/free() approach stores a usage flag there. Only if all
+users of this string are dead, one and only one buffer header using this
+string, is allowed to free this string. And additionally this header is
+not allowed to free the string immediately, then, if so, the I<Buffer_Tail>
+would be gone too. At best, the last - in terms of detecting - user of
+such a dead COW string should free it, but that would cause more
+overhead as it could be useful, so currently this string will be freed
+during the next DOD run.
+
+=head1 Simplified Figure
+
+=item
+
+                             +--------+
+      +------------------<---| Arenas |<-----------+
+      |                      +--------+-->--+      |
+      |                                     |      |
+      |         +------+    +-----------+   |  +=============+
+      |         | S0   |<---| Registers |<--)--| Interpreter |
+      |         +------+    +-----------+   |  +=============+
+      |     +---| S1   |                    |
+      |     |   +------+                    +----------+
+ +-------+  |                                          |
+ | Blk 1 |--)-->+----------+    +--------------+   +---------+
+ +-------+  |   | Buffer 1 |    | OnestringAno |   | Block 1 |
+ | Blk 2 |  |   +----------+    | therstring.. |   +---------+
+ +-------+  |   | Buffer 2 |    | ..String...  |<--| Block 2 |
+ | .     |  |   +----------+    +--------------+   +---------+
+ +-------+  |   | ...      |        ^    ^         | ...     |
+ Small Obj  |   +----------+        |    |         +---------+
+ Pool       +-->| Buffer N |--------+----+         Memory Pool
+                +----------+
+                 Buffer          Memory Block
+
+
+
+Now consider, that the I<Interpreter> shall be a I<PMC> living in
+I<Buffer X> of the underlaying interpreter and is currently running
+B<perldoc docs/memory_internals.pod>, and then redo this figure, with
+all the blanks filled in ;-)
+
+=head1 FILES
+
+smallobject.[ch], headers.c, resources.[ch], res_lea.c, dod.c, string.[ch]
+
+=head1 BUGS
+
+All spelling errors belong to those, who honestly collected them, as well as
+all errors related to my abuse of the English language - I'm not natively
+speaking it.
+
+To minimize this bugs section - please patch this document and keep it
+up to date - thanks.
+
+=head1 AUTHOR
+
+Leopold Tötsch <[EMAIL PROTECTED]>

Reply via email to