> Here are some timings on my system with your basic stats patch: > These results are taken when the first command input is expected, > having keyed-ahead the N to avoid delays.
Technically, the patch I gave you doesn't count the delay waiting for user input. But there are other things to discuss.... :) > CVS + COW: (using your original cow patch) > Pure CVS : > New 'grey' (basically the same as CVS plus the grey1 patch) Here are my results comparing the above three parrots, with my latest and greatest COW patch, attached to this email. parrot-cvs parrot-cow parrot-grey gc_alloc_new.pbc 1.16 1.25 1.27 gc_alloc_reuse.pbc 5.77 2.98 2.98 gc_generations.pbc 2.05 2.49 1.79 gc_header_new.pbc 1.11 1.14 1.29 gc_header_reuse.pbc 1.60 1.46 1.21 gc_waves_headers.pbc 0.29 0.26 0.34 gc_waves_sizeable_data.pbc 8.05 1.59 1.65 gc_waves_sizeable_headers.pbc 14.86 2.38 1.99 life (500 generations) 6.89 6.35 5.56 hanoi 15 3.92 4.08 3.85 wumpus map creation 1.46 0.52 0.56 wumpus map creation (MB mem) 2.00 1.70 2.60 All times are in seconds, except for the last, which is mem usage in MB. Comparing parrot-cow against parrot-cvs and parrot-grey directly as ratios, we get: (Lower is better here. 1.00 is baseline.) cow vs cvs cow vs grey gc_alloc_new.pbc 1.08 0.98 gc_alloc_reuse.pbc 0.52 1.00 gc_generations.pbc 1.21 1.39 gc_header_new.pbc 1.03 0.88 gc_header_reuse.pbc 0.91 1.21 gc_waves_headers.pbc 0.89 0.76 gc_waves_sizeable_data.pbc 0.20 0.96 gc_waves_sizeable_headers.pbc 0.16 1.20 life (500 generations) 0.92 1.14 hanoi 15 1.04 1.06 wumpus map creation 0.36 0.93 wumpus map creation (peak mem usage in MB) 0.85 0.65 In general, cow and grey both do well on a couple of the gc benchmarks due to the use of cow. Note that the comparison against parrot-grey is not exactly fair, because it dodn't use system stackwalking. I'm including it here because grey also uses COW, and so it is semi-meaningful. :) The fact that I'm not blown away means good things, I think. Taking the above 12 numbers, and summing them together, and dividing by 12 (a poor man's aggregate statistic), we get: cow vs cvs: 0.76 cow vs grey: 1.01 > I am rather concerned about the total Buffer structs numbers: > your cow version allocates 4x as many as cvs. It doesn't seem > to be anything that should be affected by the cow logic. Does it > get better with your reclaimable changes? From my previous > benchmarks I remember that DOD is one of our most expensive > operations, and that is dependent on the number of allocated > objects. Yes, me too. I finally tracked down the source of this, and it just goes to show my anti-benchmark-isms on minor stuff, since changing magic numbers can have much larger effects on performance. Header bloat came from the high REPLENISH_LEVEL_FACTOR, which was set at 0.8. Which basically meant that if we ever fell below 80% free, we need to allocate more buffer headers. Changing this to 0.2 magically solved most of my COW problems. Instead of the 80K total buffers used before, or the 20K buffers used in CVS, I now have 2K total buffers. Granted, this almost seems like a cop-out way of not solving the real problem. However, there is a real reason at work here. We perform DOD in two situations: a) when we need more memory for an allocation, and we don't have enough in our memory pool, we try dod'ing so we can free up headers, and free up reclaimable memory. b) when we run out of headers, we perform DOD and check against REPLENISH_LEVEL_FACTOR. Because COW implementations save memory, and in particular, my COW implementation does COW with constant buffers as well, (a) is run left often. This leaves choice (b) to occur more frequently, resulting in the mis-application of REPLENISH_LEVEL_FACTOR. The fact that COW improved things was greatly overshadowed by a broken constant. Wumpus loading performs quite well because the BASIC parser uses substr to parse and figure out how to work with strings. It's the fact that these are all using COW-substrings which is buying us the greatest memory usage increase, and thus this was the test that most exposed the problems mentioned above. Thus, my belief that refactorings are perfectly fine to do on the GC, even if they get some percentage loss in performance. Their ability to make the code easier to understand, and ultimately, easier to add dynamic tuning, generational-ness, and other such optimizations, outweighs everything, imo. > Note that I get half the memory usage with grey that you do, > even though we should be running the same code; but it is > still double current cvs. Out of interest, try adding a sweep and > collect in instructions.pasm at label getout, so the reported > active buffers and memory use are as accurate as we can > make them. Which makes me think of something - grey is Yes, this solved some of the problem. Adding a sweep&collect dropped the active header usage down to reasonable levels, which helped me realize that I wasn't actually allocating any additional headers due to COW...the GC was just inefficient in it's management of them. > ignoring reclaimable to get the size to allocate for the > post-compaction pool, therefore the memory usage is always > going to be higher than is actually needed - are we simply > looking at excess allocation here, rather than excess usage? > If so, grey will fix it in the next release with paged memory > allocation; and I'm sure you'll think of a solution also. That's also a distinct possibility. The current COW implementation ignores reclaimable altogether, piecing together a proper total_size using the code I posted in a previous email. Aaaaaaaaaaanywaaaaaaaays, >From my current benchmarks, a lot of my worries about COW have been nullified. BASIC wumpus loading now takes place in 1/3 the time, and the worst-case performance is gc_generations of 20%. And that test exclusively uses "repeat" to create lots of strings of varying lifetimes, so it's unreasonable to expect any better performance on it. So, now that the major objections to the previous patch have been addressed, does anyone have any reasons against this patch going in? Thanks, Mike Lambert
Index: core.ops =================================================================== RCS file: /cvs/public/parrot/core.ops,v retrieving revision 1.199 diff -u -r1.199 core.ops --- core.ops 18 Aug 2002 23:57:37 -0000 1.199 +++ core.ops 19 Aug 2002 02:28:35 -0000 @@ -166,9 +166,9 @@ } $1 = string_make(interpreter, NULL, 65535, NULL, 0, NULL); - memset(($1)->bufstart, 0, 65535); - fgets(($1)->bufstart, 65534, file); - ($1)->strlen = ($1)->bufused = strlen(($1)->bufstart); + memset(($1)->strstart, 0, 65535); + fgets(($1)->strstart, 65534, file); + ($1)->strlen = ($1)->bufused = strlen(($1)->strstart); goto NEXT(); } @@ -354,7 +354,7 @@ UINTVAL len = $3; s = string_make(interpreter, NULL, len, NULL, 0, NULL); - read($2, s->bufstart, len); + read($2, s->strstart, len); s->bufused = len; $1 = s; goto NEXT(); @@ -418,7 +418,7 @@ op write(in INT, in STR) { STRING * s = $2; UINTVAL count = string_length(s); - write($1, s->bufstart, count); + write($1, s->strstart, count); goto NEXT(); } @@ -2256,7 +2256,7 @@ t = string_make(interpreter, buf, (UINTVAL)(len - s->buflen), NULL, 0, NULL); $1 = string_concat(interpreter, $1, s, 1); } else { - t = string_make(interpreter, s->bufstart, (UINTVAL)len, NULL, 0, NULL); + t = string_make(interpreter, s->strstart, (UINTVAL)len, NULL, 0, NULL); } $1 = string_concat(interpreter, $1, t, 1); @@ -2281,7 +2281,7 @@ } /* XXX this is EVIL, use string_replace */ - n = $1->bufstart; + n = $1->strstart; t = string_to_cstring(interpreter, s); for (i = $4; i < $4 + $2; i++) n[i] = t[i - $4]; @@ -3891,7 +3891,7 @@ switch ($3) { case STRINGINFO_HEADER: $1 = PTR2UINTVAL($2); break; - case STRINGINFO_BUFSTART: $1 = PTR2UINTVAL($2->bufstart); + case STRINGINFO_STRSTART: $1 = PTR2UINTVAL($2->strstart); break; case STRINGINFO_BUFLEN: $1 = $2->buflen; break; @@ -4162,13 +4162,13 @@ void (*func)(void); string_to_cstring(interpreter, ($2)); string_to_cstring(interpreter, ($1)); - p = Parrot_dlopen($1->bufstart); + p = Parrot_dlopen($1->strstart); if(p == NULL) { const char * err = Parrot_dlerror(); fprintf(stderr, "%s\n", err); PANIC("Failed to load native library"); } - func = D2FPTR(Parrot_dlsym(p, $2->bufstart)); + func = D2FPTR(Parrot_dlsym(p, $2->strstart)); if (NULL == func) { PANIC("Failed to find symbol in native library"); } Index: debug.c =================================================================== RCS file: /cvs/public/parrot/debug.c,v retrieving revision 1.25 diff -u -r1.25 debug.c --- debug.c 18 Aug 2002 23:57:37 -0000 1.25 +++ debug.c 19 Aug 2002 02:28:37 -0000 @@ -692,7 +692,7 @@ constants[pc[j]]->string->strlen) { escaped = PDB_escape(interpreter->code->const_table-> - constants[pc[j]]->string->bufstart, + constants[pc[j]]->string->strstart, interpreter->code->const_table-> constants[pc[j]]->string->strlen); if (escaped) Index: dod.c =================================================================== RCS file: /cvs/public/parrot/dod.c,v retrieving revision 1.17 diff -u -r1.17 dod.c --- dod.c 18 Aug 2002 23:57:37 -0000 1.17 +++ dod.c 19 Aug 2002 02:28:37 -0000 @@ -293,9 +293,8 @@ interpreter->arena_base->pmc_pool->total_objects - total_used; } -/* Put any free buffers that aren't on the free list on the free list - * Free means: not 'live' and not immune - * Temporary immunity is also granted to newborns */ +/* Put any buffers that are now unused, on to the free list + * Avoid buffers that are immune from collection (ie, constant) */ static void free_unused_buffers(struct Parrot_Interp *interpreter, struct Small_Object_Pool *pool) @@ -310,13 +309,16 @@ cur_arena = cur_arena->prev) { Buffer *b = cur_arena->start_objects; for (i = 0; i < cur_arena->used; i++) { - /* If it's not live or on the free list, put it on the free list */ - if (!(b->flags & (BUFFER_live_FLAG | BUFFER_on_free_list_FLAG)) && - (!(b->flags & BUFFER_constant_FLAG) || - (b->flags & BUFFER_COW_FLAG))) + /* If this thing is not live and not dead yet, make it dead now. */ + if (!(b->flags & ( BUFFER_on_free_list_FLAG + | BUFFER_constant_FLAG + | BUFFER_live_FLAG ))) { if (pool->mem_pool) { - ((struct Memory_Pool *)pool->mem_pool)->reclaimable += b->buflen; + if (!(b->flags & BUFFER_COW_FLAG)) { + ((struct Memory_Pool +*)pool->mem_pool)->guaranteed_reclaimable += b->buflen; + } + ((struct Memory_Pool *)pool->mem_pool)->possibly_reclaimable += +b->buflen; } add_free_buffer(interpreter, pool, b); } else if (!(b->flags & BUFFER_on_free_list_FLAG)) { Index: hash.c =================================================================== RCS file: /cvs/public/parrot/hash.c,v retrieving revision 1.22 diff -u -r1.22 hash.c --- hash.c 18 Aug 2002 23:57:37 -0000 1.22 +++ hash.c 19 Aug 2002 02:28:38 -0000 @@ -81,9 +81,9 @@ static INTVAL key_hash(Interp *interpreter, STRING *value) { - char *buffptr = value->bufstart; - UINTVAL len = value->bufused; - register UINTVAL hash = 5381; + char *buffptr = value->strstart; + INTVAL len = value->strlen; + register INTVAL hash = 5381; UNUSED(interpreter); Index: headers.c =================================================================== RCS file: /cvs/public/parrot/headers.c,v retrieving revision 1.7 diff -u -r1.7 headers.c --- headers.c 12 Aug 2002 06:55:02 -0000 1.7 +++ headers.c 19 Aug 2002 02:28:38 -0000 @@ -100,7 +100,7 @@ /* Don't let it point to garbage memory */ buffer->bufstart = NULL; - buffer->flags = 0; + buffer->flags = BUFFER_selfpoolptr_FLAG; return buffer; } @@ -120,8 +120,11 @@ struct Small_Object_Pool * new_pmc_pool(struct Parrot_Interp *interpreter) { + STRING *name = string_make(NULL, "PMC Pool", + strlen("PMC Pool"), 0, 0, 0); struct Small_Object_Pool *pmc_pool = new_small_object_pool( - interpreter, sizeof(PMC), PMC_HEADERS_PER_ALLOC); + interpreter, sizeof(PMC), PMC_HEADERS_PER_ALLOC, name); +add_extra_buffer_header(interpreter,name); pmc_pool->add_free_object = add_free_pmc; pmc_pool->get_free_object = get_free_pmc; pmc_pool->alloc_objects = alloc_pmcs; @@ -136,11 +139,14 @@ new_bufferlike_pool(struct Parrot_Interp *interpreter, size_t actual_buffer_size) { + STRING *name = string_make(NULL, "Buffer-like Pool", + strlen("Buffer-like Pool"), 0, 0, 0); size_t buffer_size = (actual_buffer_size + sizeof(void*) - 1) & ~(sizeof(void*) - 1); struct Small_Object_Pool *pool = new_small_object_pool(interpreter, buffer_size, - BUFFER_HEADERS_PER_ALLOC); + BUFFER_HEADERS_PER_ALLOC, name); +add_extra_buffer_header(interpreter,name); pool->add_free_object = add_free_buffer; pool->get_free_object = get_free_buffer; pool->alloc_objects = alloc_buffers; @@ -155,6 +161,10 @@ { struct Small_Object_Pool *pool = new_bufferlike_pool(interpreter, sizeof(Buffer)); + STRING *name = string_make(NULL, "Buffer Pool", + strlen("Buffer Pool"), 0, 0, 0); +add_extra_buffer_header(interpreter,name); + pool->name = name; return pool; } @@ -163,6 +173,10 @@ { struct Small_Object_Pool *pool = new_bufferlike_pool(interpreter, sizeof(STRING)); + STRING *name = string_make(NULL, "String Pool", + strlen("String Pool"), 0, 0, 0); +add_extra_buffer_header(interpreter,name); + pool->name = name; pool->objects_per_alloc = STRING_HEADERS_PER_ALLOC; pool->align_1 = STRING_ALIGNMENT-1; if (constant) { @@ -233,7 +247,8 @@ ? interpreter->arena_base->constant_string_header_pool : interpreter->arena_base->string_header_pool ); - string->flags |= flags; + string->flags |= flags | BUFFER_strstart_FLAG; + string->strstart = 0; return string; } Index: interpreter.c =================================================================== RCS file: /cvs/public/parrot/interpreter.c,v retrieving revision 1.96 diff -u -r1.96 interpreter.c --- interpreter.c 17 Aug 2002 01:11:08 -0000 1.96 +++ interpreter.c 19 Aug 2002 02:28:39 -0000 @@ -43,7 +43,7 @@ const char *fp_data; INTVAL fp_len; - fp_data = PCONST(0)->string->bufstart; + fp_data = PCONST(0)->string->strstart; fp_len = PCONST(0)->string->buflen; if (strncmp(OPCODE_FINGERPRINT, fp_data, fp_len)) { Index: io.ops =================================================================== RCS file: /cvs/public/parrot/io.ops,v retrieving revision 1.11 diff -u -r1.11 io.ops --- io.ops 18 Aug 2002 23:57:37 -0000 1.11 +++ io.ops 19 Aug 2002 02:28:39 -0000 @@ -109,7 +109,7 @@ ParrotIO * io; io = (ParrotIO*)($1->data); if ($2 && io) { - PIO_write(interpreter, io, ($2)->bufstart, string_length($2)); + PIO_write(interpreter, io, ($2)->strstart, string_length($2)); } goto NEXT(); } @@ -124,7 +124,7 @@ op printerr(in STR) { if ($1) { - PIO_write(interpreter, PIO_STDERR(interpreter), ($1)->bufstart, + PIO_write(interpreter, PIO_STDERR(interpreter), ($1)->strstart, string_length($1)); } goto NEXT(); @@ -147,7 +147,7 @@ op puts(in STR) { if (($1) && string_length($1)) { - PIO_write(interpreter, PIO_STDOUT(interpreter), ($1)->bufstart, + PIO_write(interpreter, PIO_STDOUT(interpreter), ($1)->strstart, string_length($1)); } goto NEXT(); @@ -156,7 +156,7 @@ op puts(in INT) { STRING * s = string_from_int(interpreter, $1); if (string_length(s)) { - PIO_write(interpreter, PIO_STDOUT(interpreter), s->bufstart, + PIO_write(interpreter, PIO_STDOUT(interpreter), s->strstart, string_length(s)); } goto NEXT(); @@ -165,7 +165,7 @@ op puts(in NUM) { STRING * s = Parrot_sprintf_c(interpreter, "%Vf", $1); if (string_length(s)) { - PIO_write(interpreter, PIO_STDOUT(interpreter), s->bufstart, + PIO_write(interpreter, PIO_STDOUT(interpreter), s->strstart, string_length(s)); } goto NEXT(); @@ -197,8 +197,8 @@ else n = $2; $1 = string_make(interpreter, NULL, n, NULL, 0, NULL); - memset(($1)->bufstart, 0, n); - nr = PIO_read(interpreter, PIO_STDIN(interpreter), ($1)->bufstart, (size_t)n); + memset(($1)->strstart, 0, n); + nr = PIO_read(interpreter, PIO_STDIN(interpreter), ($1)->strstart, (size_t)n); if(nr > 0) ($1)->strlen = ($1)->bufused = nr; else @@ -214,8 +214,8 @@ else n = $3; $1 = string_make(interpreter, NULL, n, NULL, 0, NULL); - memset(($1)->bufstart, 0, n); - nr = PIO_read(interpreter, (ParrotIO*)($2->data), ($1)->bufstart, (size_t)n); + memset(($1)->strstart, 0, n); + nr = PIO_read(interpreter, (ParrotIO*)($2->data), ($1)->strstart, (size_t)n); if(nr > 0) ($1)->strlen = ($1)->bufused = nr; else Index: misc.c =================================================================== RCS file: /cvs/public/parrot/misc.c,v retrieving revision 1.22 diff -u -r1.22 misc.c --- misc.c 11 Jun 2002 20:11:51 -0000 1.22 +++ misc.c 19 Aug 2002 02:28:40 -0000 @@ -495,7 +495,7 @@ STRING *ret = Parrot_vsprintf_c(interpreter, pat, args); /* string_transcode(interpreter, ret, NULL, NULL, &ret);*/ - memcpy(targ, ret->bufstart, ret->bufused); + memcpy(targ, ret->strstart, ret->bufused); targ[ret->bufused + 1] = 00; } @@ -510,7 +510,7 @@ len = ret->bufused; } - memcpy(targ, ret->bufstart, len); + memcpy(targ, ret->strstart, len); targ[len + 1] = 0; } Index: packdump.c =================================================================== RCS file: /cvs/public/parrot/packdump.c,v retrieving revision 1.2 diff -u -r1.2 packdump.c --- packdump.c 17 Mar 2002 06:44:41 -0000 1.2 +++ packdump.c 19 Aug 2002 02:28:40 -0000 @@ -95,7 +95,7 @@ printf(" SIZE => %ld,\n", (long)self->string->bufused); /* TODO: Won't do anything reasonable for most encodings */ printf(" DATA => '%.*s'\n", - (int)self->string->bufused, (char *)self->string->bufstart); + (int)self->string->bufused, (char *)self->string->strstart); printf(" } ],\n"); break; Index: packout.c =================================================================== RCS file: /cvs/public/parrot/packout.c,v retrieving revision 1.9 diff -u -r1.9 packout.c --- packout.c 23 Jul 2002 02:09:27 -0000 1.9 +++ packout.c 19 Aug 2002 02:28:40 -0000 @@ -276,8 +276,8 @@ * characters to ensure padding. */ charcursor = (char *)cursor; - if (self->string->bufstart) { - mem_sys_memcopy(charcursor, self->string->bufstart, + if (self->string->strstart) { + mem_sys_memcopy(charcursor, self->string->strstart, self->string->bufused); charcursor += self->string->bufused; Index: resources.c =================================================================== RCS file: /cvs/public/parrot/resources.c,v retrieving revision 1.80 diff -u -r1.80 resources.c --- resources.c 18 Aug 2002 23:57:37 -0000 1.80 +++ resources.c 19 Aug 2002 02:28:41 -0000 @@ -41,7 +41,7 @@ /* Allocate a new block. Header info's on the front, plus a fudge * factor for good measure */ - new_block = mem_sys_allocate(sizeof(struct Memory_Block) + + new_block = mem_sys_allocate(sizeof(struct Memory_Block) + alloc_size + 32); if (!new_block) { return NULL; @@ -58,32 +58,42 @@ interpreter->memory_allocated += alloc_size; /* If this is for a public pool, add it to the list */ - if (pool) { - new_block->prev = pool->top_block; - /* If we're not first, then tack us on the list */ - if (pool->top_block) { - pool->top_block->next = new_block; - } - pool->top_block = new_block; - pool->total_allocated += alloc_size; + new_block->prev = pool->top_block; + /* If we're not first, then tack us on the list */ + if (pool->top_block) { + pool->top_block->next = new_block; } + pool->top_block = new_block; + pool->total_allocated += alloc_size; + return new_block; } +/* Allocates memory for headers */ static void * mem_allocate(struct Parrot_Interp *interpreter, size_t *req_size, struct Memory_Pool *pool, size_t align_1) { char *return_val; size_t size = *req_size; + + /* Ensure that our minimum size requirements are met, + * so that we have room for a forwarding COW pointer */ + if( size < sizeof(void*) ) + size = sizeof(void*); + + /* Make sure we have room for the buffer's tail flags, + * also used by the COW logic to detect moved buffers */ + size += sizeof(struct Buffer_Tail); + + /* Round up to requested alignment */ + size = (size + align_1) & ~align_1; + if (NULL == interpreter) { void *mem = mem_sys_allocate(size); return mem; } - /* Round up to requested alignment */ - size = (size + align_1) & ~align_1; - /* If not enough room, try to find some */ if (pool->top_block == NULL) { alloc_new_block(interpreter, size, pool); @@ -99,13 +109,13 @@ /* Compact the pool if allowed and worthwhile */ if (pool->compact) { /* don't bother reclaiming if its just chicken feed */ - if (pool->reclaimable > - (size_t)(pool->total_allocated * pool->reclaim_factor) + if ((pool->possibly_reclaimable + pool->guaranteed_reclaimable) / 2 > + (size_t)(pool->total_allocated * pool->reclaim_factor) /* don't bother reclaiming if it won't even be enough */ - && (pool->reclaimable > size) + && (pool->guaranteed_reclaimable > size) ) { - (*pool->compact)(interpreter, pool); + (*pool->compact)(interpreter, pool); } else { Parrot_do_dod_run(interpreter); @@ -124,7 +134,8 @@ return_val = pool->top_block->top; pool->top_block->top += size; pool->top_block->free -= size; - *req_size = size; + *req_size = size - sizeof(struct Buffer_Tail); + ((struct Buffer_Tail*)((char *)return_val + size - 1))->flags = 0; return (void *)return_val; } @@ -158,10 +169,17 @@ /* total-reclaimable == currently used. Add a minimum block to the * current amount, so we can avoid having to allocate it in the * future. */ - total_size = pool->total_allocated - pool->reclaimable + pool->minimum_block_size; - /* total_size = pool->total_allocated; */ - /* TODO: can reduce this by pool->total_reclaimable if we want to - * be precise */ + { + struct Memory_Block *cur_block, *next_block; + total_size = 0; + cur_block = pool->top_block; + while (cur_block) { + total_size += cur_block->size - cur_block->free; + cur_block = cur_block->prev; + } + } + total_size += pool->minimum_block_size; + /* Snag a block big enough for everything */ new_block = alloc_new_block(interpreter, total_size, pool); @@ -184,25 +202,70 @@ Buffer *b = cur_buffer_arena->start_objects; UINTVAL i; for (i = 0; i < cur_buffer_arena->used; i++) { - if (b->bufstart) { - /* Is the buffer live, and can we move it? */ - if (!(b->flags & (BUFFER_on_free_list_FLAG | - BUFFER_constant_FLAG | - BUFFER_immobile_FLAG))) - { - memcpy(cur_spot, b->bufstart, b->buflen); + if (b->bufstart && + !(b->flags & ( BUFFER_on_free_list_FLAG + | BUFFER_constant_FLAG + | BUFFER_immobile_FLAG + | BUFFER_external_FLAG + ))) { + struct + Buffer_Tail *tail = + (struct Buffer_Tail *)((char *)b->bufstart + b->buflen); + ptrdiff_t offset = (ptrdiff_t)((STRING*)b)->strstart - +(ptrdiff_t)b->bufstart; + /* buffer has already been moved; just change the header */ + if (b->flags & BUFFER_COW_FLAG + && tail->flags & TAIL_moved_FLAG) { + /* Find out who else references our data */ + Buffer* hdr = *(Buffer**)(b->bufstart); + /* Make sure they know that we own it too */ + hdr->flags |= BUFFER_COW_FLAG; + /* Now make sure we point to where the other guy does */ + b->bufstart = hdr->bufstart; + /* And if we're a string, update strstart */ + /* Somewhat of a hack, but if we get per-pool collections, + * it should help ease the pain */ + if (b->flags & BUFFER_strstart_FLAG) { + ((STRING*)b)->strstart = (char *)b->bufstart + offset; + } + } + else + if (b->flags & BUFFER_selfpoolptr_FLAG) { + struct Buffer_Tail *new_tail = + (struct Buffer_Tail *)((char *)cur_spot + b->buflen); + /* Copy our memory to the new pool */ + memcpy(cur_spot, b->bufstart, + b->buflen); + new_tail = 0; + /* If we're COW */ + if (b->flags & BUFFER_COW_FLAG) { + /* Let the old buffer know how to find us */ + *(Buffer**)(b->bufstart) = b; + /* No guaranatees that our data is still COW, + * so assume not, and let the above code fix-up */ + b->flags &= ~BUFFER_COW_FLAG; + /* Finally, let the tail know that we've moved, + * so that any other references can know to look + * for us and not re-copy */ + tail->flags |= TAIL_moved_FLAG; + } b->bufstart = cur_spot; - cur_size = b->buflen; + if (b->flags & BUFFER_strstart_FLAG) { + ((STRING*)b)->strstart = (char *)b->bufstart + offset; + } + cur_size = b->buflen + sizeof(struct Buffer_Tail); cur_size = (cur_size + header_pool->align_1) & ~header_pool->align_1; cur_spot += cur_size; } } - b = (Buffer *)((char *)b + object_size); + b = (Buffer *)((char*)b + object_size); } } } /* Run through all the out-of-band Buffer header pools and copy */ + /* This code ignores COW, for now. This essentially means that if + * any other buffers COW-reference data with the buffers below, + * that data will get duplicated during this collection run. */ for (j = 0; j < (INTVAL)( interpreter->arena_base->extra_buffer_headers.buflen / sizeof(Buffer*) ); j++) { Buffer** buffers = interpreter->arena_base->extra_buffer_headers.bufstart; Buffer* b = buffers[j]; @@ -211,11 +274,18 @@ BUFFER_constant_FLAG | BUFFER_immobile_FLAG))) { + struct Buffer_Tail *new_tail = + (struct Buffer_Tail *)((char *)cur_spot + b->buflen); + UINTVAL offset = (ptrdiff_t)((STRING*)b)->strstart - +(ptrdiff_t)b->bufstart; memcpy(cur_spot, b->bufstart, b->buflen); + new_tail->flags = 0; b->bufstart = cur_spot; cur_size = b->buflen; cur_size = (cur_size + BUFFER_ALIGNMENT - 1) & ~(BUFFER_ALIGNMENT - 1); cur_spot += cur_size; + if (b->flags & BUFFER_strstart_FLAG) { + ((STRING*)b)->strstart = (char *)b->bufstart + offset; + } } } } @@ -251,9 +321,10 @@ /* Set our new pool as the only pool */ new_block->prev = NULL; pool->total_allocated = total_size; - pool->reclaimable = 0; } + pool->guaranteed_reclaimable = 0; + pool->possibly_reclaimable = 0; interpreter->GC_block_level--; } @@ -285,7 +356,11 @@ buffer = from; copysize = (buffer->buflen > tosize ? tosize : buffer->buflen); if (interpreter) { - interpreter->arena_base->memory_pool->reclaimable += + if (!(buffer->flags & BUFFER_COW_FLAG)) { + interpreter->arena_base->memory_pool->guaranteed_reclaimable += + buffer->buflen; + } + interpreter->arena_base->memory_pool->possibly_reclaimable += buffer->buflen; mem = mem_allocate(interpreter, &alloc_size, interpreter->arena_base->memory_pool, BUFFER_ALIGNMENT-1); @@ -323,7 +398,10 @@ pool = (str->flags & BUFFER_constant_FLAG) ? interpreter->arena_base->constant_string_pool : interpreter->arena_base->memory_pool; - pool->reclaimable += str->buflen; + if (!(str->flags & BUFFER_COW_FLAG)) { + pool->guaranteed_reclaimable += str->buflen; + } + pool->possibly_reclaimable += str->buflen; mem = mem_allocate(interpreter, &alloc_size, pool, STRING_ALIGNMENT-1); if (!mem) { @@ -333,10 +411,11 @@ * track down those bugs, this can be removed which would make * things cheaper */ if (copysize) { - memcpy(mem, str->bufstart, copysize); + memcpy(mem, str->bufstart, copysize); } str->bufstart = mem; str->buflen = alloc_size; + str->strstart = str->bufstart; return mem; } @@ -364,6 +443,7 @@ str->buflen = 0; str->bufstart = NULL; + str->strstart = NULL; if (!interpreter) { str->bufstart = mem_allocate(NULL, &req_size, NULL, STRING_ALIGNMENT-1); @@ -375,6 +455,7 @@ str->bufstart = mem_allocate(interpreter, &req_size, pool, STRING_ALIGNMENT-1); } str->buflen = req_size; + str->strstart = str->bufstart; return str; } @@ -394,7 +475,8 @@ pool->compact = compact; pool->minimum_block_size = min_block; pool->total_allocated = 0; - pool->reclaimable = 0; + pool->guaranteed_reclaimable = 0; + pool->possibly_reclaimable = 0; pool->reclaim_factor = RECLAMATION_FACTOR; } return pool; Index: smallobject.c =================================================================== RCS file: /cvs/public/parrot/smallobject.c,v retrieving revision 1.9 diff -u -r1.9 smallobject.c --- smallobject.c 12 Aug 2002 06:55:02 -0000 1.9 +++ smallobject.c 19 Aug 2002 02:28:41 -0000 @@ -18,7 +18,7 @@ # define REPLENISH_LEVEL_FACTOR 0.0 # define UNITS_PER_ALLOC_GROWTH_FACTOR 1 #else -# define REPLENISH_LEVEL_FACTOR 0.8 +# define REPLENISH_LEVEL_FACTOR 0.2 # define UNITS_PER_ALLOC_GROWTH_FACTOR 4 #endif @@ -143,7 +143,7 @@ struct Small_Object_Pool * new_small_object_pool(struct Parrot_Interp *interpreter, - size_t object_size, size_t objects_per_alloc) + size_t object_size, size_t objects_per_alloc, STRING *name) { struct Small_Object_Pool *pool; @@ -159,6 +159,8 @@ pool->get_free_object = get_free_object; pool->alloc_objects = alloc_objects; pool->mem_pool = NULL; + pool->name = name; + pool->name->flags |= BUFFER_constant_FLAG; return pool; } Index: string.c =================================================================== RCS file: /cvs/public/parrot/string.c,v retrieving revision 1.84 diff -u -r1.84 string.c --- string.c 18 Aug 2002 23:57:37 -0000 1.84 +++ string.c 19 Aug 2002 02:28:42 -0000 @@ -17,6 +17,52 @@ #define EXTRA_SIZE 4 +/* String COW support */ +static void +unmake_COW(struct Parrot_Interp *interpreter, STRING *s) +{ +#if 0 + if (s->flags & BUFFER_constant_FLAG) { + /* this happens when we call string_to_cstring on + * a constant string in order to print it + */ + internal_exception(INVALID_OPERATION, + "Cannot unmake COW on a constant header"); + } + else +#endif + if (s->flags & (BUFFER_COW_FLAG|BUFFER_constant_FLAG)) { + s->bufstart = s->strstart; + s->buflen = s->strlen; + /* Create new pool data for this header to use, + * independant of the original COW data */ + Parrot_reallocate_string(interpreter, s, s->buflen); + s->flags &= ~(UINTVAL)(BUFFER_COW_FLAG | BUFFER_constant_FLAG); + } +} + +/* clone a string header without allocating a new buffer + * i.e. create a 'copy-on-write' string + */ +static STRING * +make_COW_reference(struct Parrot_Interp *interpreter, STRING *s) +{ + STRING *d; + if (s->flags & BUFFER_constant_FLAG) { + d = new_string_header(interpreter, + s->flags & ~(UINTVAL)BUFFER_constant_FLAG); + s->flags |= BUFFER_COW_FLAG; + memcpy(d, s, sizeof (STRING)); + d->flags &= ~(UINTVAL)(BUFFER_constant_FLAG|BUFFER_selfpoolptr_FLAG); + } + else { + d = new_string_header(interpreter, s->flags); + s->flags |= BUFFER_COW_FLAG; + memcpy(d, s, sizeof (STRING)); + } + return d; +} + /* Basic string stuff - creation, enlargement, destruction, etc. */ /*=for api string string_init @@ -61,9 +107,11 @@ a = string_grow(interpreter, a, ((a->bufused + b->bufused) - a->buflen) + EXTRA_SIZE); } + unmake_COW(interpreter, a); + /* Tack B on the end of A */ - mem_sys_memcopy((void *)((ptrcast_t)a->bufstart + a->bufused), - b->bufstart, b->bufused); + mem_sys_memcopy((void *)((ptrcast_t)a->strstart + a->bufused), + b->strstart, b->bufused); a->bufused += b->bufused; a->strlen += b->strlen; return a; @@ -86,7 +134,7 @@ a->encoding = b->encoding; a->type = b->type; a->language = b->language; - memcpy(a->bufstart, b->bufstart, b->bufused); + memcpy(a->strstart, b->strstart, b->bufused); return a; } } @@ -120,7 +168,7 @@ s->type = type; if (buffer) { - mem_sys_memcopy(s->bufstart, buffer, buflen); + mem_sys_memcopy(s->strstart, buffer, buflen); s->bufused = buflen; (void)string_compute_strlen(s); } @@ -136,6 +184,7 @@ */ STRING * string_grow(struct Parrot_Interp * interpreter, STRING * s, INTVAL addlen) { + unmake_COW(interpreter, s); /* Don't check buflen, if we are here, we already checked. */ Parrot_reallocate_string(interpreter, s, s->buflen + addlen); return s; @@ -170,7 +219,7 @@ INTVAL string_index(const STRING *s, UINTVAL idx) { - return s->encoding->decode(s->encoding->skip_forward(s->bufstart, idx)); + return s->encoding->decode(s->encoding->skip_forward(s->strstart, idx)); } /*=for api string string_ord @@ -218,21 +267,9 @@ * create a copy of the argument passed in */ STRING * -string_copy(struct Parrot_Interp *interpreter, const STRING *s) +string_copy(struct Parrot_Interp *interpreter, STRING *s) { - STRING *d; - d = new_string_header(interpreter, - s->flags & ~(UINTVAL)BUFFER_constant_FLAG); - Parrot_allocate_string(interpreter, d, s->buflen); - d->bufused = s->bufused; - d->strlen = s->strlen; - d->encoding = s->encoding; - d->type = s->type; - d->language = s->language; - - memcpy(d->bufstart, s->bufstart, s->buflen); - - return d; + return make_COW_reference(interpreter, s); } /*=for api string string_transcode @@ -240,7 +277,7 @@ */ STRING * string_transcode(struct Parrot_Interp *interpreter, - const STRING *src, const ENCODING *encoding, + STRING *src, const ENCODING *encoding, const CHARTYPE *type, STRING **dest_ptr) { STRING *dest; @@ -283,9 +320,9 @@ } } - srcstart = (void *)src->bufstart; + srcstart = (void *)src->strstart; srcend = srcstart + src->bufused; - deststart = dest->bufstart; + deststart = dest->strstart; destend = deststart; while (srcstart < srcend) { @@ -319,7 +356,7 @@ INTVAL string_compute_strlen(STRING *s) { - s->strlen = s->encoding->characters(s->bufstart, s->bufused); + s->strlen = s->encoding->characters(s->bufstart, s->bufused) - +((UINTVAL)s->strstart - (UINTVAL)s->bufstart); return s->strlen; } @@ -342,9 +379,9 @@ } result = string_make(interpreter, NULL, a->bufused + b->bufused, a->encoding, 0, a->type); - mem_sys_memcopy(result->bufstart, a->bufstart, a->bufused); - mem_sys_memcopy((void *)((ptrcast_t)result->bufstart + a->bufused), - b->bufstart, b->bufused); + mem_sys_memcopy(result->strstart, a->strstart, a->bufused); + mem_sys_memcopy((void *)((ptrcast_t)result->strstart + a->bufused), + b->strstart, b->bufused); result->strlen = a->strlen + b->strlen; result->bufused = a->bufused + b->bufused; } @@ -388,8 +425,8 @@ /* copy s into dest num times */ for (i = 0; i < num; i++) { - mem_sys_memcopy((void *)((ptrcast_t)dest->bufstart + s->bufused * i), - s->bufstart, s->bufused); + mem_sys_memcopy((void *)((ptrcast_t)dest->strstart + s->bufused * i), + s->strstart, s->bufused); } dest->bufused = s->bufused * num; @@ -406,7 +443,7 @@ * Allocate memory for d if necessary. */ STRING * -string_substr(struct Parrot_Interp *interpreter, const STRING *src, +string_substr(struct Parrot_Interp *interpreter, STRING *src, INTVAL offset, INTVAL length, STRING **d) { STRING *dest; @@ -424,7 +461,6 @@ return string_make(interpreter, NULL, 0, src->encoding, 0, src->type); } - true_length = (UINTVAL)length; if (offset < 0) { true_offset = (UINTVAL)(src->strlen + offset); } @@ -433,30 +469,29 @@ internal_exception(SUBSTR_OUT_OF_STRING, "Cannot take substr outside string"); } + + true_length = (UINTVAL)length; if (true_length > (src->strlen - true_offset)) { true_length = (UINTVAL)(src->strlen - true_offset); } - substart_off = (const char *)src->encoding->skip_forward(src->bufstart, + substart_off = (const char *)src->encoding->skip_forward(src->strstart, true_offset) - - (char *)src->bufstart; + (char *)src->strstart; subend_off = - (const char *)src->encoding->skip_forward((char *)src->bufstart + + (const char *)src->encoding->skip_forward((char *)src->strstart + substart_off, true_length) - - (char *)src->bufstart; - - dest = - string_make(interpreter, NULL, true_length * src->encoding->max_bytes, - src->encoding, 0, src->type); + (char *)src->strstart; if (subend_off < substart_off) { internal_exception(SUBSTR_OUT_OF_STRING, "subend somehow is less than substart"); } - mem_sys_memcopy(dest->bufstart, (char *)src->bufstart + substart_off, - (unsigned)(subend_off - substart_off)); + /* do in-place if possible */ + dest = make_COW_reference(interpreter, src); + dest->strstart = (char *)dest->strstart + substart_off; dest->bufused = subend_off - substart_off; dest->strlen = true_length; @@ -515,14 +550,14 @@ } /* Save the substring that is replaced for the return value */ - substart_off = (const char *)src->encoding->skip_forward(src->bufstart, + substart_off = (const char *)src->encoding->skip_forward(src->strstart, true_offset) - - (char *)src->bufstart; + (char *)src->strstart; subend_off = - (const char *)src->encoding->skip_forward((char *)src->bufstart + + (const char *)src->encoding->skip_forward((char *)src->strstart + substart_off, true_length) - - (char *)src->bufstart; + (char *)src->strstart; if (subend_off < substart_off) { internal_exception(SUBSTR_OUT_OF_STRING, @@ -533,7 +568,7 @@ string_make(interpreter, NULL, true_length * src->encoding->max_bytes, src->encoding, 0, src->type); - mem_sys_memcopy(dest->bufstart, (char *)src->bufstart + substart_off, + mem_sys_memcopy(dest->strstart, (char *)src->strstart + substart_off, (unsigned)(subend_off - substart_off)); dest->bufused = subend_off - substart_off; dest->strlen = true_length; @@ -553,15 +588,17 @@ if(diff >= 0 || ((INTVAL)src->bufused - (INTVAL)src->buflen) <= diff) { + unmake_COW(interpreter, src); + if(diff != 0) { - mem_sys_memmove((char*)src->bufstart + substart_off + rep->bufused, - (char*)src->bufstart + subend_off, + mem_sys_memmove((char*)src->strstart + substart_off + rep->bufused, + (char*)src->strstart + subend_off, src->buflen - (subend_off - diff)); src->bufused -= diff; } - mem_sys_memcopy((char*)src->bufstart + substart_off, - rep->bufstart, rep->bufused); + mem_sys_memcopy((char*)src->strstart + substart_off, + rep->strstart, rep->bufused); if(diff != 0) (void)string_compute_strlen(src); } @@ -575,11 +612,11 @@ /* Move the end of old string that isn't replaced to new offset * first */ - mem_sys_memmove((char*)src->bufstart + subend_off + diff, - (char*)src->bufstart + subend_off, + mem_sys_memmove((char*)src->strstart + subend_off + diff, + (char*)src->strstart + subend_off, src->buflen - subend_off); /* Copy the replacement in */ - mem_sys_memcopy((char *)src->bufstart + substart_off, rep->bufstart, + mem_sys_memcopy((char *)src->strstart + substart_off, rep->strstart, rep->bufused); src->bufused += diff; (void)string_compute_strlen(src); @@ -595,8 +632,8 @@ STRING * string_chopn(STRING *s, INTVAL n) { - const char *bufstart = s->bufstart; - const char *bufend = bufstart + s->bufused; + const char *strstart = s->strstart; + const char *bufend = strstart + s->bufused; UINTVAL true_n; true_n = (UINTVAL)n; @@ -609,7 +646,7 @@ bufend = s->encoding->skip_backward(bufend, true_n); - s->bufused = bufend - bufstart; + s->bufused = bufend - strstart; s->strlen = s->strlen - true_n; return s; @@ -645,9 +682,9 @@ NULL); } - s1start = s1->bufstart; + s1start = s1->strstart; s1end = s1start + s1->bufused; - s2start = s2->bufstart; + s2start = s2->strstart; s2end = s2start + s2->bufused; while (cmp == 0 && s1start < s1end && s2start < s2end) { @@ -685,7 +722,7 @@ if (len == 1) { - UINTVAL c = s->encoding->decode(s->bufstart); + UINTVAL c = s->encoding->decode(s->strstart); if (s->type->is_digit(c) && s->type->get_digit(c) == 0) { return 0; @@ -714,7 +751,7 @@ INTVAL i = 0; if (s) { - const char *start = s->bufstart; + const char *start = s->strstart; const char *end = start + s->bufused; int sign = 1; INTVAL in_number = 0; @@ -754,7 +791,7 @@ FLOATVAL f = 0.0; if (s) { - const char *start = s->bufstart; + const char *start = s->strstart; const char *end = start + s->bufused; int sign = 1; INTVAL seen_dot = 0; @@ -879,10 +916,19 @@ { char *cstring; - if (s->buflen == s->bufused) + /* We shouldn't modify a constant string, + * so instead create a new copy of it */ + if (s->flags & BUFFER_constant_FLAG) { + s = make_COW_reference(interpreter,s); + } + + unmake_COW(interpreter, s); + + if (s->buflen == s->bufused) { string_grow(interpreter, s, 1); + } - cstring = s->bufstart; + cstring = s->strstart; cstring[s->bufused] = 0; Index: trace.c =================================================================== RCS file: /cvs/public/parrot/trace.c,v retrieving revision 1.18 diff -u -r1.18 trace.c --- trace.c 23 Jul 2002 02:09:27 -0000 1.18 +++ trace.c 19 Aug 2002 02:28:42 -0000 @@ -48,7 +48,7 @@ break; case PARROT_ARG_SC: escaped = PDB_escape(interpreter->code->const_table-> - constants[*(pc + i)]->string->bufstart, + constants[*(pc + i)]->string->strstart, interpreter->code->const_table-> constants[*(pc + i)]->string->strlen); fprintf(stderr, "\"%s\"", escaped); @@ -73,7 +73,7 @@ case PARROT_ARG_S: if (interpreter->ctx.string_reg.registers[*(pc + i)]) { escaped = PDB_escape(interpreter->ctx.string_reg. - registers[*(pc + i)]->bufstart, + registers[*(pc + i)]->strstart, interpreter->ctx.string_reg. registers[*(pc + i)]->strlen); fprintf(stderr, "S%ld=\"%s\"", (long)*(pc + i), Index: warnings.c =================================================================== RCS file: /cvs/public/parrot/warnings.c,v retrieving revision 1.10 diff -u -r1.10 warnings.c --- warnings.c 23 Jul 2002 02:09:27 -0000 1.10 +++ warnings.c 19 Aug 2002 02:28:43 -0000 @@ -37,7 +37,7 @@ return -1; } - if (PIO_write(interpreter, PIO_STDERR(interpreter), targ->bufstart, + if (PIO_write(interpreter, PIO_STDERR(interpreter), targ->strstart, targ->bufused) < 0) { return -2; } @@ -77,7 +77,7 @@ return -1; } - if (PIO_write(interpreter, PIO_STDERR(interpreter), targ->bufstart, + if (PIO_write(interpreter, PIO_STDERR(interpreter), targ->strstart, targ->bufused) < 0) { return -2; } Index: classes/perlstring.pmc =================================================================== RCS file: /cvs/public/parrot/classes/perlstring.pmc,v retrieving revision 1.27 diff -u -r1.27 perlstring.pmc --- classes/perlstring.pmc 8 Aug 2002 20:57:48 -0000 1.27 +++ classes/perlstring.pmc 19 Aug 2002 02:28:43 -0000 @@ -72,7 +72,7 @@ STRING* s2 = (STRING*)other->data; return (INTVAL)( other->vtable == SELF->vtable && s1->bufused == s2->bufused && - (memcmp(s1->bufstart,s2->bufstart,(size_t)s1->bufused)==0)); + (memcmp(s1->strstart,s2->bufstart,(size_t)s1->bufused)==0)); } void set_integer (PMC* value) { Index: docs/jit.pod =================================================================== RCS file: /cvs/public/parrot/docs/jit.pod,v retrieving revision 1.5 diff -u -r1.5 jit.pod --- docs/jit.pod 1 Aug 2002 19:57:24 -0000 1.5 +++ docs/jit.pod 19 Aug 2002 02:28:44 -0000 @@ -164,9 +164,9 @@ Gets replaced by the C<FLOATVAL> constant specified in the I<n>th argument. -B<STRING_CONST_bufstart[n]> +B<STRING_CONST_strstart[n]> -Gets replaced by C<bufstart> of the C<STRING> constant specified in the I<n>th argument. +Gets replaced by C<strstart> of the C<STRING> constant specified in the I<n>th +argument. B<STRING_CONST_buflen[n]> @@ -285,7 +285,7 @@ Parrot_print_sc { movl $1,&TEMP_INT[1] - SYSTEMCALL(WRITE,3, A&TEMP_INT[1] V&STRING_CONST_bufstart[1] V*STRING_CONST_strlen[1]) + SYSTEMCALL(WRITE,3, A&TEMP_INT[1] V&STRING_CONST_strstart[1] +V*STRING_CONST_strlen[1]) } Parrot_end { Index: docs/strings.pod =================================================================== RCS file: /cvs/public/parrot/docs/strings.pod,v retrieving revision 1.8 diff -u -r1.8 strings.pod --- docs/strings.pod 10 Jan 2002 23:23:03 -0000 1.8 +++ docs/strings.pod 19 Aug 2002 02:28:44 -0000 @@ -162,15 +162,16 @@ how the C<STRING> structure works. You can find the definition of this structure in F<string.h>: - struct parrot_string { - void *bufstart; - INTVAL buflen; - INTVAL bufused; - INTVAL flags; - INTVAL strlen; - INTVAL encoding; - INTVAL type; - INTVAL unused; + struct parrot_string_t { + void *bufstart; + UINTVAL buflen; + UINTVAL flags; + UINTVAL bufused; + void *strstart; + UINTVAL strlen; + const ENCODING *encoding; + const CHARTYPE *type; + INTVAL language; }; Let's look at each element of this structure in turn. @@ -187,6 +188,11 @@ This is used for memory allocation; it tells you the currently allocated size of the buffer in bytes. +=head2 C<flags> + +This is a general holding area for string flags. The exact flags +required have not yet been determined. + =head2 C<bufused> C<bufused> on the other hand, contains the number of bytes out of the @@ -194,10 +200,11 @@ C<buflen>, is used by the buffer growing algorithm to determine when and by how much to grow the allocation buffer. -=head2 C<flags> +=head2 C<strstart> -This is a general holding area for string flags. The exact flags -required have not yet been determined. +This stores the actual start of the string. In the case of COW strings +holding references to portions of a larger string, (for example, in regex +match variables), this is a pointer into the start of the string. =head2 C<strlen> @@ -242,9 +249,9 @@ XXX I don't know what this is for. -=head2 C<unused> +=head2 C<language> -This field is, as its name suggests, unused; however, it can be used to +This field is currently unused; however, it can be used to hold a pointer to the correct vtable for foreign strings. =head1 String Vtable Functions Index: examples/assembly/life.pasm =================================================================== RCS file: /cvs/public/parrot/examples/assembly/life.pasm,v retrieving revision 1.11 diff -u -r1.11 life.pasm --- examples/assembly/life.pasm 24 Jun 2002 16:40:56 -0000 1.11 +++ examples/assembly/life.pasm 19 Aug 2002 02:28:45 -0000 @@ -5,7 +5,7 @@ # of life # # First the generation count - set I2, 500 + set I2, 5000 # Note the time time N5 # If true, we don't print Index: include/parrot/resources.h =================================================================== RCS file: /cvs/public/parrot/include/parrot/resources.h,v retrieving revision 1.37 diff -u -r1.37 resources.h --- include/parrot/resources.h 3 Aug 2002 07:35:36 -0000 1.37 +++ include/parrot/resources.h 19 Aug 2002 02:28:45 -0000 @@ -29,7 +29,9 @@ void (*compact)(struct Parrot_Interp *, struct Memory_Pool *); size_t minimum_block_size; size_t total_allocated; /* total bytes allocated to this pool */ - size_t reclaimable; /* bytes that can be reclaimed (approximate) */ + size_t guaranteed_reclaimable; /* bytes that can definitely be reclaimed*/ + size_t possibly_reclaimable; /* bytes that can possibly be reclaimed + * (above plus COW-freed bytes) */ FLOATVAL reclaim_factor; /* minimum percentage we will reclaim */ }; Index: include/parrot/smallobject.h =================================================================== RCS file: /cvs/public/parrot/include/parrot/smallobject.h,v retrieving revision 1.5 diff -u -r1.5 smallobject.h --- include/parrot/smallobject.h 28 Jul 2002 23:24:45 -0000 1.5 +++ include/parrot/smallobject.h 19 Aug 2002 02:28:45 -0000 @@ -36,6 +36,7 @@ void *mem_pool; size_t start_arena_memory; size_t end_arena_memory; + STRING* name; }; INTVAL contained_in_pool(struct Parrot_Interp *, @@ -59,7 +60,7 @@ void alloc_objects(struct Parrot_Interp *, struct Small_Object_Pool *); struct Small_Object_Pool * new_small_object_pool(struct Parrot_Interp *, - size_t, size_t); + size_t, size_t, STRING *); struct Small_Object_Pool * get_sized_small_object_pool(struct Parrot_Interp *, size_t); Index: include/parrot/string.h =================================================================== RCS file: /cvs/public/parrot/include/parrot/string.h,v retrieving revision 1.43 diff -u -r1.43 string.h --- include/parrot/string.h 18 Jul 2002 04:30:42 -0000 1.43 +++ include/parrot/string.h 19 Aug 2002 02:28:45 -0000 @@ -26,6 +26,7 @@ UINTVAL buflen; UINTVAL flags; UINTVAL bufused; + void *strstart; UINTVAL strlen; const ENCODING *encoding; const CHARTYPE *type; @@ -45,6 +46,15 @@ typedef struct parrot_string_t String; + +/* Tail added to end of string buffers; used for COW GC */ +struct Buffer_Tail { + unsigned char flags; +}; +typedef enum TAIL_flag { + TAIL_moved_FLAG = 1 << 0, +} TAIL_flags; + /* Buffer flags */ typedef enum BUFFER_flag { /* bits the GC can keep its dirty mitts off of */ @@ -76,12 +86,18 @@ /* For debugging, report when this buffer gets moved around */ BUFFER_report_FLAG = 1 << 16, /* Generation in the GC pools */ - BUFFER_generation_FLAG = 1 << 17 | 1 << 18 + BUFFER_generation_FLAG = 1 << 17 | 1 << 18, + /* Buffer header has a strstart which needs to be updated with bufstart */ + BUFFER_strstart_FLAG = 1 << 19, + /* Buffer's memory data is in this header's header pool's memory pool */ + /* for now, this is true in constant headers for constant buffer data,and + * true for non-constant headers pointing at non-constant buffer data */ + BUFFER_selfpoolptr_FLAG = 1 << 20, } BUFFER_flags; /* stringinfo parameters */ #define STRINGINFO_HEADER 1 -#define STRINGINFO_BUFSTART 2 +#define STRINGINFO_STRSTART 2 #define STRINGINFO_BUFLEN 3 #define STRINGINFO_FLAGS 4 #define STRINGINFO_BUFUSED 5 Index: include/parrot/string_funcs.h =================================================================== RCS file: /cvs/public/parrot/include/parrot/string_funcs.h,v retrieving revision 1.13 diff -u -r1.13 string_funcs.h --- include/parrot/string_funcs.h 24 Jun 2002 16:41:28 -0000 1.13 +++ include/parrot/string_funcs.h 19 Aug 2002 02:28:45 -0000 @@ -23,7 +23,7 @@ STRING *string_repeat(struct Parrot_Interp *, const STRING *, UINTVAL, STRING **); STRING *string_chopn(STRING *, INTVAL); -STRING *string_substr(struct Parrot_Interp *, const STRING *, INTVAL, +STRING *string_substr(struct Parrot_Interp *, STRING *, INTVAL, INTVAL, STRING **); STRING *string_replace(struct Parrot_Interp *, STRING *, INTVAL, INTVAL, STRING *, STRING **); @@ -43,8 +43,8 @@ STRING *string_make(struct Parrot_Interp *, const void *buffer, UINTVAL buflen, const ENCODING *, UINTVAL flags, const CHARTYPE *); -STRING *string_copy(struct Parrot_Interp *, const STRING *); -STRING *string_transcode(struct Parrot_Interp *, const STRING *src, +STRING *string_copy(struct Parrot_Interp *, STRING *); +STRING *string_transcode(struct Parrot_Interp *, STRING *src, const ENCODING *, const CHARTYPE *, STRING **dest_ptr); void string_init(void); Index: t/op/gc.t =================================================================== RCS file: /cvs/public/parrot/t/op/gc.t,v retrieving revision 1.2 diff -u -r1.2 gc.t --- t/op/gc.t 3 Aug 2002 07:30:09 -0000 1.2 +++ t/op/gc.t 19 Aug 2002 02:28:46 -0000 @@ -85,3 +85,4 @@ starting ending OUTPUT + Index: t/op/string.t =================================================================== RCS file: /cvs/public/parrot/t/op/string.t,v retrieving revision 1.29 diff -u -r1.29 string.t --- t/op/string.t 27 Jul 2002 20:18:12 -0000 1.29 +++ t/op/string.t 19 Aug 2002 02:28:47 -0000 @@ -1,6 +1,6 @@ #! perl -w -use Parrot::Test tests => 87; +use Parrot::Test tests => 90; use Test::More; output_is( <<'CODE', <<OUTPUT, "set_s_s|sc" ); @@ -1289,6 +1289,56 @@ 0 OUTPUT + +output_is( <<'CODE', <<OUTPUT, "concat/substr (COW)" ); + set S0, "<JA" + set S1, "PH>" + set S2, "" + concat S2, S2, S0 + concat S2, S2, S1 + print S2 + print "\n" + substr S0, S2, 1, 4 + print S0 + print "\n" + end +CODE +<JAPH> +JAPH +OUTPUT + +output_is( <<'CODE', <<OUTPUT, "constant to cstring" ); + stringinfo I0, "\n", 2 + stringinfo I1, "\n", 2 + eq I1, I0, ok1 + print "N" +ok1: + print "OK" + print "\n" + stringinfo I2, "\n", 2 + eq I2, I0, ok2 + print "N" +ok2: + print "OK\n" + end +CODE +OK +OK +OUTPUT + +output_is( <<'CODE', <<OUTPUT, "COW with chopn leaving original untouched" ); + set S0, "ABCD" + clone S1, S0 + chopn S0, 1 + print S0 + print "\n" + print S1 + print "\n" + end +CODE +ABC +ABCD +OUTPUT # Set all string registers to values given by &$_[0](reg num) sub set_str_regs {