Re: [perl #57260] [BUG] Segfaults in sprintf opcode
- Original Message - From: "Christoph Otto" <[EMAIL PROTECTED]> To: Sent: Friday, July 25, 2008 10:29 AM Subject: Re: [perl #57260] [BUG] Segfaults in sprintf opcode Will Coleda (via RT) wrote: # New Ticket Created by Will Coleda # Please include the string: [perl #57260] # in the subject line of all future correspondence about this issue. # http://rt.perl.org/rt3/Ticket/Display.html?id=57260 > Tripped over these trying to run some spec tests for Tcl. I have no idea what $S0 should have after the invocation, but the current segfaults are undoubtedly incorrect. > If someone wants to fix this, the following works as a minimal test case. .sub kaboom $P0 = new 'ResizableIntegerArray' push $P0, -1 push $P0, 1 $S0 = sprintf "%*c", $P0 .end According to 'man sprintf', a negative field length is equivalent to a minus option and the absolute length. The following patch has been applied in revision 29735. Appropriate tests to be added later. Regards Peter Gibbs Index: src/spf_render.c === --- src/spf_render.c(revision 29734) +++ src/spf_render.c(working copy) @@ -310,6 +310,7 @@ INTVAL len = 0; INTVAL old = 0; INTVAL pat_len = (INTVAL)string_length(interp, pat); +HUGEINTVAL num; /* start with a buffer; double the pattern length to avoid realloc #1 */ STRING *targ = string_make_empty(interp, enum_stringrep_one, pat_len << 1); @@ -492,8 +493,14 @@ case '*': info.flags |= FLAG_WIDTH; -info.width = (UINTVAL)obj->getint(interp, - SIZE_XVAL, obj); +num = obj->getint(interp, SIZE_XVAL, obj); +if (num < 0) { +info.flags |= FLAG_MINUS; +info.width = -num; +} +else { +info.width = num; +} /* fall through */ case '.':
Re: [perl #57260] [BUG] Segfaults in sprintf opcode
- Original Message - From: "Geoffrey Broadwell" <[EMAIL PROTECTED]> To: "Peter Gibbs" <[EMAIL PROTECTED]> Cc: "Christoph Otto" <[EMAIL PROTECTED]>; Sent: Friday, July 25, 2008 6:09 PM Subject: Re: [perl #57260] [BUG] Segfaults in sprintf opcode On Fri, 2008-07-25 at 13:40 +0200, Peter Gibbs wrote: +HUGEINTVAL num; Does this really need to be a HUGEINTVAL? Why is INTVAL not sufficient? Tracking backwards from the assignment in Parrot_sprintf_format: num = obj->getint(interp, SIZE_XVAL, obj); obj is a parameter to Parrot_sprintf_format defined by: ARGIN(SPRINTF_OBJ *obj) This is defined in include/parrot/misc.h as: typedef struct sprintf_obj SPRINTF_OBJ; struct sprintf_obj { ... sprintf_getint_t getint; ... } and finally: typedef HUGEINTVAL(*sprintf_getint_t) (PARROT_INTERP,INTVAL, SPRINTF_OBJ *); So, since obj->getint returns a HUGEINTVAL, I gave it one to store the result in. As to why sprintf_obj is defined that way, I have no clue. Regards Peter Gibbs
Re: [perl #57260] [BUG] Segfaults in sprintf opcode
- Original Message - From: "Christoph Otto" <[EMAIL PROTECTED]> To: Sent: Friday, July 25, 2008 6:43 PM Subject: Re: [perl #57260] [BUG] Segfaults in sprintf opcode Will Coleda wrote: Once we get a core parrot test for this, we can close out the ticket. Thanks! I tried writing a test to exercise this, but the test fails and I have no idea why. I've attached the patch for anyone who cares to add sufficient magic to make the test pass. The current logic in sprintf.t does not handle multiple arguments in the data. I started on something to handle a simple array, but there are other tests defined in sprintf_tests with more complicated expressions inside square brackets, so it was more work than I had time for. We need either a full parser for the perl-style tests, or a simpler test harness that doesn't try to do the perl-specific stuff. Regards Peter Gibbs
Re: [perl #57260] [BUG] Segfaults in sprintf opcode
- Original Message - From: "Will Coleda" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Friday, July 25, 2008 10:31 PM Subject: Re: [perl #57260] [BUG] Segfaults in sprintf opcode On Fri, Jul 25, 2008 at 4:23 PM, Peter Gibbs via RT <[EMAIL PROTECTED]> wrote: - Original Message - From: "Christoph Otto" <[EMAIL PROTECTED]> To: Sent: Friday, July 25, 2008 6:43 PM Subject: Re: [perl #57260] [BUG] Segfaults in sprintf opcode Will Coleda wrote: Once we get a core parrot test for this, we can close out the ticket. Thanks! I tried writing a test to exercise this, but the test fails and I have no idea why. I've attached the patch for anyone who cares to add sufficient magic to make the test pass. The current logic in sprintf.t does not handle multiple arguments in the data. I started on something to handle a simple array, but there are other tests defined in sprintf_tests with more complicated expressions inside square brackets, so it was more work than I had time for. We need either a full parser for the perl-style tests, or a simpler test harness that doesn't try to do the perl-specific stuff. Regards Peter Gibbs In the meantime, adding a test file called, say sprintf_ext.t would allow us to write a test and close out this ticket. sprintf2.t added in revision 29738 with a few simple multi-argument tests. More probably need to be added. Regards Peter Gibbs
Re: Parrot_sprintf_c question.
- Original Message - From: "Will Coleda" <[EMAIL PROTECTED]> To properly support $tcl_precision in tcl, I need to change how I'm currently implementing {$tcl_precision == 0}. Right now, I just fake it by setting the precision to 16, but that isn't right. where something like {expr acos(0)} should output: 1.5707963267948966 it outputs (with the above C code) 1.5708 And in partcl's current state, it outputs 1.570796326794897 A few points to ponder. 1) The default precision for %g in C is 6 significant digits, i.e. 1.57080 (but sprintf drops trailing zeroes) 2) 1.5707963267948966 actually corresponds to 17-digit precision (which is the limit for double), not 16 3) tcl uses its own special handling for tcl_precision = 0, refer to http://www.tcl.tk/cgi-bin/tct/tip/132.html for more information 4) Look at David Gay's code (see http://www.netlib.org/fp/index.html, specifically gdtoa) for similar logic 5) Although I haven't looked, the tcl source code must contain their actual implementation I don't know whether we expect enough usage of Parrot in mathemetical circles to be worthwhile adding one of these implementions to our version of sprintf. Regards Peter Gibbs
Re: [perl #57504] [PATCH][Lua] Fixed 64bit bug in Lua bytecode decoder/translator.
- Original Message - From: "Robert G. Jakabosky (via RT)" <[EMAIL PROTECTED]> Subject: [perl #57504] [PATCH][Lua] Fixed 64bit bug in Lua bytecode decoder/translator. Changes to 'languages/lua/t/os.t': 1 test from 'os.t' fails on 64bit systems because 'os.time()' does not return a 'nil' for dates with year < 1970. 'os.time()' only returns 'nil' when mktime returns ((time_t)-1). On 32bit systems mktime uses -1 for dates outside year range 1970-2038. On 64bit systems mktime returns negative values for dates before year 1970, so there is only one date that will cause mktime to return -1 and make 'os.time()' in lua return 'nil'. I have updated the test to use that one date that returns 'nil' on both 32bit & 64bit systems. Unfortunately this solution only works in a specific time zone. I have committed (rev 29938) an alteration to this patch, which calculates the date/time corresponding to -1. This should now be self-adjusting for various time zones. Regards Peter Gibbs
Correction to string patch
David In your last change (splitting buffer allocation from string) I assume you also intended to shorten the initial allocation. Peter Gibbs EmKel Systems Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.30 diff -c -r1.30 string.c *** string.c30 Dec 2001 21:08:48 - 1.30 --- string.c31 Dec 2001 06:55:56 - *** *** 44,50 encoding = encoding_lookup(type->default_encoding); } ! s = mem_sys_allocate(sizeof(STRING)+buflen); s->bufstart = mem_sys_allocate(buflen+1); s->encoding = encoding; s->flags = flags; --- 44,50 encoding = encoding_lookup(type->default_encoding); } ! s = mem_sys_allocate(sizeof(STRING)); s->bufstart = mem_sys_allocate(buflen+1); s->encoding = encoding; s->flags = flags; --
Another correction to string patch
David You have also forgotten to free the second allocation. I see that you call free_string(), which is in resources.c, but don't use the matching new_string_header() function in that file - are these not intended to be a matched pair for future GC purposes? I am assuming for now that both free() calls should go in free_string() rather than one in string_destroy() but I don't know exactly what you and/or Dan intended here. Peter Gibbs EmKel Systems Index: resources.c === RCS file: /home/perlcvs/parrot/resources.c,v retrieving revision 1.2 diff -c -r1.2 resources.c *** resources.c 13 Dec 2001 00:51:10 - 1.2 --- resources.c 31 Dec 2001 20:03:28 - *** *** 27,31 --- 27,32 } void free_string(STRING *string) { + free(string->bufstart); free(string); }
Re: Another correction to string patch
- Original Message - From: "Dan Sugalski" <[EMAIL PROTECTED]> > Yup. Destroyed strings just get their buffer pointers set to NULL. GC'll > collect things up. Also means COW string buffers can share pointers to the > same buffer. > Dan/David With regard to COW strings - would the buffer-related parameters (eg bufstart, buflen, bufused) then not be linked to the buffer rather than the string? ie should we have two structs (STRING & BUFFER) rather than one combined one, with STRING pointing to BUFFER rather than directly to memory? Peter Gibbs
[PATCH] string_transcode
I suspect that the length of the output buffer for string_transcode should be based on the output encoding, not on the input encoding. Peter Gibbs EmKel Systems Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.33 diff -c -r1.33 string.c *** string.c31 Dec 2001 17:23:03 - 1.33 --- string.c1 Jan 2002 10:11:56 - *** *** 154,160 return string_copy(interpreter, src); } ! dest = string_make(interpreter, NULL, src->strlen*src->encoding->max_bytes , encoding, 0, type); if (src->type != dest->type) { --- 154,160 return string_copy(interpreter, src); } ! dest = string_make(interpreter, NULL, src->strlen*encoding->max_bytes, encoding, 0, type); if (src->type != dest->type) {
[PATCH] string_transcode
Another correction to string_transcode; this function now seems to work okay (tested using a dummy 'encode' op added to my local copy of core.ops) Peter Gibbs EmKel Systems Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.35 diff -c -r1.35 string.c *** string.c1 Jan 2002 17:53:50 - 1.35 --- string.c1 Jan 2002 21:21:22 - *** *** 179,185 srcstart = (void*)src->bufstart; srcend = srcstart + src->bufused; deststart = dest->bufstart; ! destend = deststart + dest->buflen; while (srcstart < srcend) { INTVAL c = src->encoding->decode(srcstart); --- 179,185 srcstart = (void*)src->bufstart; srcend = srcstart + src->bufused; deststart = dest->bufstart; ! destend = deststart; while (srcstart < srcend) { INTVAL c = src->encoding->decode(srcstart); *** *** 187,193 if (transcoder1) c = transcoder1(c); if (transcoder2) c = transcoder2(c); ! deststart = dest->encoding->encode(deststart, c); srcstart = src->encoding->skip_forward(srcstart, 1); } --- 187,193 if (transcoder1) c = transcoder1(c); if (transcoder2) c = transcoder2(c); ! destend = dest->encoding->encode(destend, c); srcstart = src->encoding->skip_forward(srcstart, 1); }
Re: TODOs for STRINGs
- Original Message - From: "David & Lisa Jacobs" <[EMAIL PROTECTED]> > * Add transcoding ops (this might be a specific case of the previous e.g., > set S0, S1, "unicode", "utf-16") Note that there is still a bug in string_transcode as of string.c 1.35; I have repeated a patch below. I tested this with a separate opcode, but I like your idea of using set. > * Add size of string termination to encodings (i.e., how many 0 bytes) Should string termination be required? If strings are assumed to be terminated, it seems to me that precludes buffer re-use (copy-on-write) for substrings. On the other hand, passing strings to native C library routines requires the termination. One compromise would be for 'unique' strings to be terminated, and do 'copy-on-read' if a non-unique string (i.e. one borrowing a buffer) is passed to an external function. Peter Gibbs EmKel Systems Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.35 diff -c -r1.35 string.c *** string.c 1 Jan 2002 17:53:50 - 1.35 --- string.c 2 Jan 2002 07:27:16 - *** *** 179,185 srcstart = (void*)src->bufstart; srcend = srcstart + src->bufused; deststart = dest->bufstart; ! destend = deststart + dest->buflen; while (srcstart < srcend) { INTVAL c = src->encoding->decode(srcstart); --- 179,185 srcstart = (void*)src->bufstart; srcend = srcstart + src->bufused; deststart = dest->bufstart; ! destend = deststart; while (srcstart < srcend) { INTVAL c = src->encoding->decode(srcstart); *** *** 187,193 if (transcoder1) c = transcoder1(c); if (transcoder2) c = transcoder2(c); ! deststart = dest->encoding->encode(deststart, c); srcstart = src->encoding->skip_forward(srcstart, 1); } --- 187,193 if (transcoder1) c = transcoder1(c); if (transcoder2) c = transcoder2(c); ! destend = dest->encoding->encode(destend, c); srcstart = src->encoding->skip_forward(srcstart, 1); }
Possible error in key_hash
Jeff If I understand the STRING struct correctly, buflen is the physical allocation of the buffer, and bufused is the number of bytes actually used by the string; therefore the latter would be correct for key_hash? Peter Gibbs EmKel Systems Index: key.c === RCS file: /home/perlcvs/parrot/key.c,v retrieving revision 1.8 diff -c -r1.8 key.c *** key.c 8 Jan 2002 06:34:49 - 1.8 --- key.c 8 Jan 2002 15:40:55 - *** *** 93,99 static INTVAL key_hash(struct Parrot_Interp *interpreter, STRING* value) { char* buffptr = value->bufstart; ! INTVAL len= value->buflen; INTVAL hash = 5893; while(len--) { --- 93,99 static INTVAL key_hash(struct Parrot_Interp *interpreter, STRING* value) { char* buffptr = value->bufstart; ! INTVAL len= value->bufused; INTVAL hash = 5893; while(len--) {
Re: [PATCH] Minor fixes to rx.c
> > I would strongly recommend that perl6 mandates that buffers are not nul > terminated. Anything that needs a nul should arrange for one to be appended. > [eg by ensuring that the buffer is writable, extending it by one byte if > needs be, and writing that nul, or by copying out the contents.] > If this recommendation is to be implemented, it probably needs to be done sooner rather than later, before too much code appears that needs to be changed. I would suggest implementing a function to supply a terminated string if required, something like string_nt() below; this returns a char* rather than a STRING* because terminated strings should only be used for passing outside parrot. s->buflen has been changed to be the true allocated length; the extra byte could be removed along with the actual null-termination once nobody relies on it; however, if memory allocation will be happening on fixed boundaries, it may make sense to round the buffer length up to match. -- Peter Gibbs EmKel Systems Index: include/parrot/string.h === RCS file: /home/perlcvs/parrot/include/parrot/string.h,v retrieving revision 1.20 diff -c -r1.20 string.h *** include/parrot/string.h 7 Jan 2002 22:09:15 - 1.20 --- include/parrot/string.h 10 Jan 2002 18:18:23 - *** *** 67,72 --- 67,74 STRING* string_transcode(struct Parrot_Interp *interpreter, const STRING *src, const ENCODING *encoding, const CHARTYPE *type, STRING **d); + char* + string_nt(struct Parrot_Interp *interpreter, const STRING *s); void string_init(void); INTVAL Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.38 diff -c -r1.38 string.c *** string.c 9 Jan 2002 22:35:14 - 1.38 --- string.c 10 Jan 2002 18:18:33 - *** *** 49,55 s->encoding = encoding; s->flags = flags; s->type = type; ! s->buflen = buflen; if (buffer) { mem_sys_memcopy(s->bufstart, buffer, buflen); --- 49,55 s->encoding = encoding; s->flags = flags; s->type = type; ! s->buflen = buflen+1; if (buffer) { mem_sys_memcopy(s->bufstart, buffer, buflen); *** *** 201,206 --- 201,221 } return dest; + } + + /*=for api string string_nt + * return pointer to null-terminated character string + * this may be the original string's buffer or a copy thereof + */ + char* + string_nt(struct Parrot_Interp *interpreter, const STRING *s) { + + if (s->buflen < s->bufused+1) { + s = string_make(interpreter, s->bufstart, s->bufused+1, + s->encoding, s->flags, s->type); + } + memset((char *)s->bufstart+s->bufused,0,1); + return s->bufstart; } /* vtable despatch functions */
GC discarding free pmc pool?
Dan I have been reading memory.c and resources.c trying to understand how memory allocation and garbage collection works, and have found what looks to be a bug. The initial allocation routine in mem_setup_allocator creates two blocks, one for the free string header pool and one for the free pmc header pool. However, Parrot_go_collect copies only the free string header pool to the newly allocated block (followed by the compacted strings), then proceeds to free the entire memory pool - what happens to the free pmc header pool?? I have done a simple test by adding a " new P1, PerlInt" to life.pasm in the middle of the stats output lines, and on my linux x86 system this gives a segfault, which seems to confirm that something has gone awry somewhere. -- Peter Gibbs EmKel Systems
Re: Anyone using the current regex ops?
Dan I assume you meant pmc_pool not buffer_header_pool ?? -- Peter Gibbs EmKel Systems Index: resources.c === RCS file: /home/perlcvs/parrot/resources.c,v retrieving revision 1.30 diff -c -r1.30 resources.c *** resources.c 18 Mar 2002 16:42:06 - 1.30 --- resources.c 18 Mar 2002 19:21:21 - *** *** 702,711 /* Collect the PMC header pool */ memcpy(cur_spot, !interpreter->arena_base->buffer_header_pool->pool_buffer.bufstart, !interpreter->arena_base->buffer_header_pool->pool_buffer.buflen); ! interpreter->arena_base->buffer_header_pool->pool_buffer.bufstart = cur_spot ; ! cur_size = interpreter->arena_base->buffer_header_pool->pool_buffer.buflen; if (cur_size & 0x0f) { cur_size &= ~0x0f; cur_size += 16; --- 702,711 /* Collect the PMC header pool */ memcpy(cur_spot, !interpreter->arena_base->pmc_pool->pool_buffer.bufstart, !interpreter->arena_base->pmc_pool->pool_buffer.buflen); ! interpreter->arena_base->pmc_pool->pool_buffer.bufstart = cur_spot; ! cur_size = interpreter->arena_base->pmc_pool->pool_buffer.buflen; if (cur_size & 0x0f) { cur_size &= ~0x0f; cur_size += 16;
Re: [PATCH] Stack bugfix
- Original Message - From: "Melvin Smith" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Cc: "dan Sugalski" <[EMAIL PROTECTED]> Sent: 26 March 2002 09:39 Subject: [PATCH] Stack bugfix > > Afterwards the test program ran with larger file sizes, eventually > crashing again, but this time in the GC. <*PUNT* Dan goes back to the 10 > yard line.> > > ../parrot reverse.pbc < string.c > > > > recurse depth 0 > Segmentation fault (core dumped) > This seems to caused by a nice timing problem - the string header has been allocated, then attempting to allocate the string buffer triggers a garbage collection. The string header still points to wherever it did before it was freed, and has already been marked as live, so the GC code tries to copy the old buffer contents to the new memory pool. Patch below fixes it by clearing the buffer pointer on allocation of a new header; it could alternatively be done when the header is added to the free list. -- Peter Gibbs EmKel Systems Index: resources.c === RCS file: /home/perlcvs/parrot/resources.c,v retrieving revision 1.35 diff -u -r1.35 resources.c --- resources.c 26 Mar 2002 16:33:01 - 1.35 +++ resources.c 27 Mar 2002 07:38:48 - @@ -645,6 +645,8 @@ interpreter->active_Buffers++; /* Mark it live */ return_me->flags = BUFFER_live_FLAG; +/* Make sure it doesn't point anywhere yet */ +return_me->bufstart = NULL; /* Return it */ return return_me; }
GC bugs
I have traced the problems with Josh's deep stack pushp/popp test to two problems with the garbage collector: 1) Strings pointed to by pmc->cache.struct_val are not marked as live 2) If a dod run is initiated while a pmc is being created, it will be freed as nobody points to it yet Since both these problems involve design considerations, I am leaving it to Dan to fix them. Quick hacks to bypass the problems show that the pushp/popp test then works fine. -- Peter Gibbs EmKel Systems
Re: Strings/Stack access hanging (small version)
Clinton The following patch seems to fix both these problems. It makes some slight changes to the logic, so should be considered a temporary fix until Dan has time to take a look at the code. -- Peter Gibbs EmKel Systems Index: resources.c === RCS file: /home/perlcvs/parrot/resources.c,v retrieving revision 1.35 diff -c -r1.35 resources.c *** resources.c 26 Mar 2002 16:33:01 - 1.35 --- resources.c 28 Mar 2002 13:40:37 - *** *** 338,344 /* If the PMC we've been handed has already been marked as live (ie we put it on the list already) we just return. Otherwise we could get in some nasty loops */ ! if (used_pmc->next_for_GC) { return current_end_of_list; } --- 338,344 /* If the PMC we've been handed has already been marked as live (ie we put it on the list already) we just return. Otherwise we could get in some nasty loops */ ! if (used_pmc->next_for_GC || used_pmc == current_end_of_list) { return current_end_of_list; } *** *** 346,352 used_pmc->flags |= PMC_live_FLAG; /* Now put it on the end of the list */ ! current_end_of_list->next_for_GC = used_pmc; /* return the PMC we were passed as the new end of the list */ return used_pmc; --- 346,354 used_pmc->flags |= PMC_live_FLAG; /* Now put it on the end of the list */ ! if (current_end_of_list) { ! current_end_of_list->next_for_GC = used_pmc; ! } /* return the PMC we were passed as the new end of the list */ return used_pmc; *** *** 362,373 struct PRegChunk *cur_chunk; /* We have to start somewhere, and the global stash is a good place */ ! last = current = interpreter->perl_stash->stash_hash; /* mark it as used and get an updated end of list */ last = mark_used(current, last); /* Wipe out the next for gc bit, otherwise we'll never get anywhere */ ! last->next_for_GC = NULL; /* Now, go run through the PMC registers and mark them as live */ /* First mark the current set. */ --- 364,376 struct PRegChunk *cur_chunk; /* We have to start somewhere, and the global stash is a good place */ ! current = interpreter->perl_stash->stash_hash; ! last = NULL; /* mark it as used and get an updated end of list */ last = mark_used(current, last); /* Wipe out the next for gc bit, otherwise we'll never get anywhere */ ! /* last->next_for_GC = NULL; */ /* Now, go run through the PMC registers and mark them as live */ /* First mark the current set. */
Re: Memory Corruption Bug
"Michel J Lambert" <[EMAIL PROTECTED]> wrote: > Attached is a .pasm file which causes some string data to be written into > the middle of the string_pool->pool_buffer list of entries, such that when > it tries to dereference foo in new_pmc_header, it's pointing to garbage > memory. 0x20202020 for me, which is four spaces. Changing the save/restore > of spaces in the pasm file to use periods causes the pointer to be > 0x2e2e2e2e. I tried for a bit on this, but couldn't really track it down > any more than that. Hopefully someone else can figure it out. Two problems found so far: 1) mem_realloc passes the incorrect size to Parrot_allocate (this causes the specific error mentioned above) 2) add_header_to_free calls mem_realloc calls Parrot_allocate calls go_collect which moves the free header pool The first is a simple fix; the second needs either suppression of collection during the procedure, or not using mem_realloc. The patch below puts the reallocation code into add_header_to_free and add_pmc_to_free, this means that, in the scenario described, the free pool will be moved by the garbage collector, and then immediately moved again. Since the free pools are not compressed by go_collect, perhaps they should be allocated independently and not copied around all the time?? After fixing the above, the test program still abends, this time with "subend somehow is less than substart" - I have not yet followed up on this. -- Peter Gibbs EmKel Systems Index: memory.c === RCS file: /home/perlcvs/parrot/memory.c,v retrieving revision 1.29 diff -u -r1.29 memory.c --- memory.c 18 Mar 2002 16:42:06 - 1.29 +++ memory.c 29 Mar 2002 17:41:17 - @@ -145,7 +145,7 @@ { size_t copysize = (fromsize > tosize ? tosize : fromsize); void *mem; -mem = Parrot_allocate(interpreter, copysize); +mem = Parrot_allocate(interpreter, tosize); if (!mem) { return NULL; } Index: resources.c === RCS file: /home/perlcvs/parrot/resources.c,v retrieving revision 1.35 diff -u -r1.35 resources.c --- resources.c 26 Mar 2002 16:33:01 - 1.35 +++ resources.c 29 Mar 2002 17:42:15 - @@ -18,22 +18,26 @@ static void add_header_to_free(struct Parrot_Interp *interpreter, struct free_pool *pool, void *to_add); -/* Add a string header to the free string header pool */ +/* Add a PMC header to the free pool */ static void add_pmc_to_free(struct Parrot_Interp *interpreter, struct free_pool *pool, void *to_add) { PMC **temp_ptr; /* First, check and see if there's enough space in the free pool. If - we're within the size of a STRING pointer, we make it bigger */ + we're within the size of a pointer, we make it bigger */ if (pool->entries_in_pool * sizeof(PMC *) >= pool->pool_buffer.buflen - sizeof(PMC *)) { /* If not, make the free pool bigger. We enlarge it by 20% */ -pool->pool_buffer.bufstart = mem_realloc(interpreter, - pool->pool_buffer.bufstart, - pool->pool_buffer.buflen, - (UINTVAL)(pool->pool_buffer.buflen * 1.2)); -pool->pool_buffer.buflen = (UINTVAL)(pool->pool_buffer.buflen * 1.2); - +/* don't use mem_realloc, because garbage collection may occur */ +size_t new_size = pool->pool_buffer.buflen * 1.2; +void *new_ptr = Parrot_allocate(interpreter, new_size); +if (!new_ptr) { +internal_exception(ALLOCATION_ERROR, + "No room to expand free PMC pool"); +} +memcpy(new_ptr, pool->pool_buffer.bufstart, pool->pool_buffer.buflen); +pool->pool_buffer.bufstart = new_ptr; +pool->pool_buffer.buflen = new_size; } /* Okay, so there's space. Add the header on */ @@ -269,12 +273,16 @@ if (pool->entries_in_pool * sizeof(STRING *) >= pool->pool_buffer.buflen - sizeof(STRING *)) { /* If not, make the free pool bigger. We enlarge it by 20% */ -pool->pool_buffer.bufstart = mem_realloc(interpreter, - pool->pool_buffer.bufstart, - pool->pool_buffer.buflen, - (UINTVAL)(pool->pool_buffer.buflen * 1.2)); -pool->pool_buffer.buflen = (UINTVAL)(pool->pool_buffer.buflen * 1.2); - +/* don't use mem_realloc, because garbage collection may occur */ +size_t new_size = pool->pool_buffer.buflen * 1.2; +void *new_ptr = Parrot_allocate(interpreter, new_size); +if (!new_ptr) { +internal_exception(ALLOCATION_ERROR, + "No room to expand free buffer header pool"); +} +memcpy(new
Re: Bugfix release?
"Dan Sugalski" <[EMAIL PROTECTED]> wrote > With the recent stack and GC patches, are we pretty much solid now? > If so, a 0.0.5 bugfix release may well be in order. The one outstanding issue that I know of is the mem_realloc problem in add_pmc_to_free and add_header_to_free. Since the problem is actually endemic to mem_realloc, the best solution seems to be a replacement for mem_realloc that takes a Buffer*, so that an intervening GC run still leaves it knowing where the source is. In line with a previous suggestion, I propose that such a routine be called Parrot_reallocate. A patch to create such a function and use it within add_pmc_to_free and add_header_to_free follows; since all uses of mem_realloc are potentially at risk, it might be a good idea to change any such places also. -- Peter Gibbs EmKel Systems Index: include/parrot/resources.h === RCS file: /home/perlcvs/parrot/include/parrot/resources.h,v retrieving revision 1.25 diff -u -r1.25 resources.h --- include/parrot/resources.h 18 Mar 2002 22:09:10 - 1.25 +++ include/parrot/resources.h 30 Mar 2002 14:58:26 - @@ -31,6 +31,7 @@ void *Parrot_allocate(struct Parrot_Interp *, size_t size); void *Parrot_alloc_new_block(struct Parrot_Interp *, size_t, UINTVAL); +void Parrot_reallocate(struct Parrot_Interp *, Buffer *, size_t); void Parrot_new_pmc_header_arena(struct Parrot_Interp *interpreter); Index: resources.c === RCS file: /home/perlcvs/parrot/resources.c,v retrieving revision 1.37 diff -u -r1.37 resources.c --- resources.c 30 Mar 2002 05:57:32 - 1.37 +++ resources.c 30 Mar 2002 14:57:38 - @@ -18,22 +18,18 @@ static void add_header_to_free(struct Parrot_Interp *interpreter, struct free_pool *pool, void *to_add); -/* Add a string header to the free string header pool */ +/* Add a PMC header to the free pool */ static void add_pmc_to_free(struct Parrot_Interp *interpreter, struct free_pool *pool, void *to_add) { PMC **temp_ptr; /* First, check and see if there's enough space in the free pool. If - we're within the size of a STRING pointer, we make it bigger */ + we're within the size of a pointer, we make it bigger */ if (pool->entries_in_pool * sizeof(PMC *) >= pool->pool_buffer.buflen - sizeof(PMC *)) { /* If not, make the free pool bigger. We enlarge it by 20% */ -pool->pool_buffer.bufstart = mem_realloc(interpreter, - pool->pool_buffer.bufstart, - pool->pool_buffer.buflen, - (UINTVAL)(pool->pool_buffer.buflen * 1.2)); -pool->pool_buffer.buflen = (UINTVAL)(pool->pool_buffer.buflen * 1.2); - +size_t new_size = pool->pool_buffer.buflen * 1.2; +Parrot_reallocate(interpreter, &pool->pool_buffer, new_size); } /* Okay, so there's space. Add the header on */ @@ -273,12 +269,8 @@ if (pool->entries_in_pool * sizeof(STRING *) >= pool->pool_buffer.buflen - sizeof(STRING *)) { /* If not, make the free pool bigger. We enlarge it by 20% */ -pool->pool_buffer.bufstart = mem_realloc(interpreter, - pool->pool_buffer.bufstart, - pool->pool_buffer.buflen, - (UINTVAL)(pool->pool_buffer.buflen * 1.2)); -pool->pool_buffer.buflen = (UINTVAL)(pool->pool_buffer.buflen * 1.2); - +size_t new_size = pool->pool_buffer.buflen * 1.2; +Parrot_reallocate(interpreter, &pool->pool_buffer, new_size); } /* Okay, so there's space. Add the header on */ @@ -879,6 +871,25 @@ } } return (void *)return_val; +} + +/* Change the size of a buffer/string created by Parrot_allocate + Input parameter is a pointer to the Buffer or String structure + bufstart is updated to point to the new memory + buflen is updated to the new length + min(buflen,new_size) bytes are copied from old location to new location +*/ +void +Parrot_reallocate(struct Parrot_Interp *interpreter, Buffer *buffer, size_t new_size) { +size_t copysize = (buffer->buflen > new_size ? new_size : buffer->buflen); +void *new_ptr = Parrot_allocate(interpreter, new_size); +if (!new_ptr) { +internal_exception(ALLOCATION_ERROR, + "Out of memory during buffer reallocation"); +} +memcpy(new_ptr, buffer->bufstart, copysize); +buffer->bufstart = new_ptr; +buffer->buflen = new_size; } /* Tag a buffer header as alive. Used by the GC system when tracing
Re: PMCs requiring a 'set' dest register?
"Michel J Lambert" <[EMAIL PROTECTED]> wrote: > A second approach is to throw out this weird transmogrifying class stuff, > and just construct a new PMC of the appropriate type to put into the > destination register. Why would we *ever* care what's in the destination > register, since it never gets it's vtable methods called. We just go in Remember that the higher level (eg perl6) will expect to be able to find the PMC afterwards. For example: $foo = "abc"; $bar = \$foo; print $$bar; $foo = 3; print $$bar; Under perl 5, this will print abc3, as the reference stays intact following the re-assignment to $foo. Assuming that this will stay the same for perl 6, we must either keep the original PMC address, or find and correct all references to it. -- Peter Gibbs EmKel Systems
Re: PMCs requiring a 'set' dest register?
"Michel J Lambert" <[EMAIL PROTECTED]> wrote: >This depends upon how variables and the registers interact. If, like a >real CPU, registers must be stored back into traditional memory as >variables, then there is no problem. >If instead, registers are aliased onto traditional memory variables, such >that a PMC pointed to by a register is *also* pointed to by a stash >somewhere, then it's a bit harder. The issue here is whether PerlRef PMCs will reference a PMC or a symbol table/stash entry. I was assuming the former, in which case we have something like: #$foo = "abc"; new P0, PerlString set P0, "abc" stash_store P0, "foo" #$bar = \$foo; new P1, PerlRef stash_load P0, "foo" #Not needed if P0 unchanged set P1, P0 #Make P1 reference P0 (not 'foo') stash_store P1, "bar" #print $$bar deref P2, P1 #put the PMC pointed to by P1, into P2 print P2 #$foo = 3; set P0, 3 # no stash_store here because we have not created a new variable #print $$bar deref P2, P1 print P2 and we are relying on the address of P0 to remain constant. > STRING* s = string_copy(INTERP, (STRING*)SELF->cache.struct_val); > dest->cache.struct_val = > string_concat(INTERP, > s, > value->vtable->get_string(INTERP, value), > 0 > ); > /* don't destroy s, as it is dest->cache.struct_val */ > > What's the point of making a copy of that string? I don't really know, > and I don't see the point. The comment at the end doesn't help any, either > This code seems to be related to differences between documentation and implementation. Quote from docs/strings.pod: - To concatenate two strings - that is, to add the contents of string C to the end of string C, use: STRING* string_concat(STRING* a, STRING *b, INTVAL flag) C is updated, and is also returned as a convenience. The current implementation of string_concat always creates a new string; both a & b are unchanged. The code snippet you give seems to believe the documentation. Perhaps we need both two-argument and three-argument versions; this could be implemented as a three-argument function which detects 'out == in1' and acts accordingly. Note that string_repeat and string_substr are documented in the same place as only creating a new string if the final parameter is NULL; the implementations again differ. I agree that the whole issue of garbage collection relating to newborns and temporaries is problematic. One option might be to always flag newly-created objects in some way, then have a single function call to release them; at least this means just one function call at the end of each procedure. This still leaves a problem with nested procedures. -- Peter Gibbs EmKel Systems
COW strings
> > Implementing COW is a bit harder, although now that we have a DOD pass, a > lot easier. We can update counts in there...it's just not very easy to see > how we're going to keep track of refcounts. > I made two assumptions for my test implementation of COW strings: 1) we need to be able to share substrings as well as complete strings 2) COW must survive garbage collection Without these two, I believe the overhead probably outweighs the benefits However, these assumptions then led to a few implementation difficulties: 1) The string header has to reference both the start of the buffer and the start of the string i.e. we need to add either a second pointer or an offset. This impacts on all code using bufstart as the start of the string. 2) Some sort of flag or counter is required during garbage collection to ensure that a given buffer is only copied once. Since the current version of Parrot_allocate always makes the buffer larger than the requested size, I just used the byte after buflen. To make sure I always have space for the relocation address, I changed string_make to round the buffer size up to one less than the next multiple of 16; _allocate will then go one bigger, which will be the flag byte. [Incidentally, this makes sense anyway if any in-place string alterations occur, as the otherwise-wasted padding can be used to grow the string] (go_collect currently uses a different rounding rule, which incidentally makes zero-length buffers end up pointing to the same location as the next non-zero length buffer; for COW purposes I changed this to work the same as _allocate) The flag byte needs three values: a) Uncopied buffer (set when buffer is allocated) b) Copied once, not referenced again c) Copied once and referenced by at least one other header I was originally thinking of a separate pass to find buffers in state (b) and reset their (header's) COW flag; an alternative would be to store the address of the first header in the old buffer (need to ensure that a minimum-size buffer can hold two pointers as well as the flag byte!) and clear its COW flag regardless; if another reference to the same buffer is found, then set the COW flag in the first header. This removes the second pass at a slightly increased cost for the normal GC pass. Before Dan put his memory allocation / garbage collection system in place, COW gave a significant performance increase on string-intensive processing, due to the overheads of using the system memory allocator and to the ever-increasing memory utilisation. In my latest tests, COW seemed to make very little difference; however, I have not yet implemented the clearing of the COW flag as discussed above, so "once a cow, always a cow", which will have some negative impact. -- Peter Gibbs EmKel Systems
Definition of a null string?
Currently, calling string_make(interp, NULL,0,NULL,0,NULL), as is done by, for example "new P0, PerlString", calls Parrot_allocate with a required allocation of zero, which gets converted to 16; a 16-byte buffer is hence allocated, and its address placed in bufstart. However, buflen will still be zero, therefore the block will be discarded as soon as any actual content is assigned to the string. Parrot_go_collect uses a slightly different rounding rule, and will allocate zero bytes to a zero-length buffer; the current allocation address will be placed in bufstart, but the same address will be used for the next buffer collected. Would it be sensible to define 'buflen=0, bufused=0, bufstart=NULL' as a valid null string? This would impact on any code that assumes bufstart to be valid. If not, are there any other suggestions for ways to avoid this proliferation of useless blocks, or should we just let them be created and discarded? -- Peter Gibbs EmKel Systems
Re: COW strings
uite often, and is actually a worthwhile optimisation even without COWs. If either of the inputs to _concat is null, it already invokes string_copy and therefore COWs Strangely, string_chopn did not need to change, as it only changes the string header (bufused and strlen) and not the buffer itself (thanks to the earlier decision to remove null termination from parrot strings) I did not change string_repeat; this could be modified to do a simple string_copy if the repeat count was one, and to act in-place if possible Since then, string_replace has been added; for now, I will just change this to do a straight un-cow (i.e. make a copy of the buffer if it is marked as COW) > Mike Lambert > -- Peter Gibbs EmKel Systems
[PATCH] Re: Definition of a null string?
>> On Tuesday 02 April 2002 04:36, Peter Gibbs wrote: >"Bryan C. Warnock" <[EMAIL PROTECTED]> wrote: > buflen should be 16, not 0. bufused should be 0. Currently, string_make sets the value of buflen. It does not know about Parrot_allocate's rounding rule, so it uses the supplied size. The patch below changes string_make to set the allocation request (and hence buflen) to one less than the next multiple of 16; thus a zero allocation will end up with a buflen of 15. This looks to be the best that can be done simply. > > Parrot_go_collect uses a slightly different rounding rule, and will > allocate > > zero bytes to a zero-length buffer; the current allocation address will be > > placed in bufstart, but the same address will be used for the next buffer > > collected. > > They should probably round the same. The patch below changes the GC code to use the same rounding algorithm as Parrot_allocate i.e. round up to next multiple of 16. -- Peter Gibbs EmKel Systems Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.65 diff -u -r1.65 string.c --- string.c 30 Mar 2002 03:04:37 - 1.65 +++ string.c 2 Apr 2002 13:24:50 - @@ -47,13 +47,13 @@ } s = new_string_header(interpreter); -s->bufstart = Parrot_allocate(interpreter, buflen); +s->bufstart = Parrot_allocate(interpreter, buflen | 15); s->encoding = encoding; /* Make sure we maintain the flags we might already have on the * string header we just fetched */ s->flags |= flags; s->type = type; -s->buflen = buflen; +s->buflen = buflen | 15; if (buffer) { mem_sys_memcopy(s->bufstart, buffer, buflen); Index: resources.c === RCS file: /home/perlcvs/parrot/resources.c,v retrieving revision 1.39 diff -u -r1.39 resources.c --- resources.c 1 Apr 2002 04:35:14 - 1.39 +++ resources.c 2 Apr 2002 13:13:53 - @@ -727,10 +727,8 @@ interpreter->arena_base->string_header_pool->pool_buffer.buflen); interpreter->arena_base->string_header_pool->pool_buffer.bufstart = cur_spot; cur_size = interpreter->arena_base->string_header_pool->pool_buffer.buflen; - if (cur_size & 0x0f) { -cur_size &= ~0x0f; -cur_size += 16; - } + cur_size += 16; + cur_size &= ~0x0f; cur_spot += cur_size; /* Collect the PMC header pool */ @@ -739,10 +737,8 @@ interpreter->arena_base->pmc_pool->pool_buffer.buflen); interpreter->arena_base->pmc_pool->pool_buffer.bufstart = cur_spot; cur_size = interpreter->arena_base->pmc_pool->pool_buffer.buflen; - if (cur_size & 0x0f) { -cur_size &= ~0x0f; -cur_size += 16; - } + cur_size += 16; + cur_size &= ~0x0f; cur_spot += cur_size; /* And the buffer header pool */ @@ -751,10 +747,8 @@ interpreter->arena_base->buffer_header_pool->pool_buffer.buflen); interpreter->arena_base->buffer_header_pool->pool_buffer.bufstart = cur_spot; cur_size = interpreter->arena_base->buffer_header_pool->pool_buffer.buflen; - if (cur_size & 0x0f) { -cur_size &= ~0x0f; -cur_size += 16; - } + cur_size += 16; + cur_size &= ~0x0f; cur_spot += cur_size; /* Run through all the buffer header pools and copy */ @@ -772,10 +766,8 @@ string_array[i].buflen); string_array[i].bufstart = cur_spot; cur_size = string_array[i].buflen; -if (cur_size & 0x0f) { - cur_size &= ~0x0f; - cur_size += 16; -} +cur_size += 16; +cur_size &= ~0x0f; cur_spot += cur_size; } }
Re: [PATCH] Re: Definition of a null string?
On 03 April 2002 05:09 Bryan C. Warnock wrote: > We should probably have Parrot_allocate (and realloc) actually pass back > how much it *really* allocated, as opposed to having to guess and/or > reimplement the rounding algorithm everywhere. > > Either > >buflen = Parrot_allocate(interpreter, requested_size, &bufstart); > > or > >bufstart = Parrot_allocate(interpreter, requested_size, &buflen); > > or something else similar. My first thoughts on this are that we should go the whole way, and have Parrot_allocate take a Buffer* and a requested size, and let it fill in the bufstart and buflen parameters (as in the not-yet-implemented Parrot_reallocate patch). This would cause problems for anybody trying to use Parrot_allocate to create temporary storage; however, that is a highly dangerous thing to do anyway without taking the possible impact of GC into consideration. The only place I can find a call to Parrot_allocate that does not place the resulting pointer into a buffer is in the code for 'open' in io.ops. This code is interesting anyway, because it will fail if the second allocate attempt triggers a collection; since this is only for a mode string, the chances are very small, but not zero. Why are we passing a C-style string around inside parrot anyway? -- Peter Gibbs EmKel Systems
Re: [PATCH] Parrot_(re)allocate_buffer
> this would make mem_realloc and Parrot_reallocate_buffer a little bit more > difficult, since they'd have to allocate a temporary buffer on the stack I had thought of moving the current functionality of Parrot_allocate into an internal-use-only function (allocate_buffer or some such), which would then be called by both Parrot_allocate and Parrot_reallocate. The alignment/padding rules would then be implemented in _allocate and _reallocate, but nowhere else. Of course, putting the 'correct' buflen into the string header messes up my evil use of the last byte for COWs! -- Peter Gibbs EmKel Systems
Re: Worst-case GC Behavior?
> Since we're never freeing any memory, it continually is allocating a block > of size 56 (memory pool) + 1 (character) from the underlying system api. Note that Parrot_alloc_new_block will allocate a minimum of DEFAULT_SIZE (set in config.h; currently 32768), so this situation is not quite as bad as it first appears. > Each time through the array, it has to alloc a PMC header. When we > allocate the header, we store it into P0, and the old header is > essentially freed. > > The next time through the loop, entries_in_pool is 0, and it triggers > alloc_more_string_Headers, and a dod run. This finds the PMC we just > freed, and uses it. Repeat. Each time through the loop, it triggers a dod > run. As another example, doing a 5000-generation run of life.pasm ends up on my system with numbers like: 5000 generations in 91.883605 seconds. 54.416672 generations/sec A total of 32768 bytes were allocated A total of 130932 DOD runs were made A total of 10930 collection runs were made Copying a total of 0 bytes There are 81 active Buffer structs There are 256 total Buffer structs In other words, each generation is causing an average of 26 DOD runs. This code does not use PMCs, but string headers exhibit the same problem. Clearing S0 to S14 to zero after the initialisation code, which makes 15 additional free string headers available for the rest of the program, reduces the number of DOD runs by about 15000. One option might be a threshold - if, after the DOD run, there is still less than N headers available, allocate more even though we can satisfy the immediate requirement. This would improve performance by reducing the number of DOD runs, but at the cost of additional memory - a classic tradeoff! -- Peter Gibbs EmKel Systems
Re: COW for strings
I don't think we are in a position yet to prove much of anything as regards real-world Perl programs, but just one data point as an example - using examples/assembly/life.pasm (changed to 5000 generations) [Pentium 166MHz; linux 2.2.18] Clean CVS checkout (time averaged over 3 runs) 5000 generations in 91.883605 seconds. 54.416672 generations/sec A total of 32768 bytes were allocated A total of 130932 DOD runs were made A total of 10930 collection runs were made Copying a total of 0 bytes There are 81 active Buffer structs There are 256 total Buffer structs Full string COW (time averaged over 3 runs) - 5000 generations in 82.940713 seconds. 60.284025 generations/sec A total of 32768 bytes were allocated A total of 130932 DOD runs were made A total of 2412 collection runs were made Copying a total of 0 bytes There are 81 active Buffer structs There are 256 total Buffer structs -- Peter Gibbs EmKel Systems
Minor patch to make the GC count total bytes copied
The memory_collected GC statistic does not get updated at present. Patch below fixes. Note that a 5000-generation run of life.pasm allocates 32K, and copies almost 58MB. -- Peter Gibbs EmKel Systems Index: resources.c === RCS file: /home/perlcvs/parrot/resources.c,v retrieving revision 1.40 diff -u -r1.40 resources.c --- resources.c 9 Apr 2002 03:49:48 - 1.40 +++ resources.c 11 Apr 2002 15:02:49 - @@ -787,6 +787,7 @@ /* How much is free. That's the total size minus the amount we used */ new_block->free = new_block->size - (new_block->top - new_block->start); + interpreter->memory_collected += (new_block->top - new_block->start); /* Now we're done. Put us as the only block on the free list and free the rest */
Re: [PATCH] Disable GC at startup
Michel J Lambert wrote: > Right. However, that's not to say that memory cannot grow. The interpreter > allocates the various *_pools from the interpreter's memory_pool, and it > gets copied with each collection run. This memory can grow and change in > size as more memory for pools are needed. Currently, we use the GC's > buffer system for tha. Alternately, we could use mem_sys_allocate, and > manually free and allocate memory for them. The latter would then allow > these pools to not be copied during collection runs, and would eliminate > that particular element of unsafety. The current design never shrinks the free header pools, and indeed there is probably little point in doing so, so nothing seems to be gained from including them in the collection process. Using my favourite 5000-generation life.pasm as an example: A total of 10930 collection runs were made Copying a total of 57801936 bytes Since the three pools occupy 1024 bytes each, 33576960 bytes of the above copying is due to the pools. -- Peter Gibbs EmKel Systems - Original Message - From: "Michel J Lambert" <[EMAIL PROTECTED]> To: "Bryan C. Warnock" <[EMAIL PROTECTED]> Cc: "Dan Sugalski" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: 12 April 2002 09:22 Subject: Re: [PATCH] Disable GC at startup > > > So you're saying that the calls to get memory during interpreter > > > initialization are somehow guaranteed to not require more memory (and thus > > > a dod or collection run)? Currently, this guarantee is not expressed in > > > > I don't understand the "thus." Nothing states that requesting memory > > mandates a DOD/GC run. > > If I call Parrot_allocate, it checks to see if it has enough allocated > memory to provide the request. If not, it performs a collection run, and > tries again. If not, it tries allocationg a new block of memory, and > returns that. If that fails, it gives up. > > So if our initial allocation size for some reason isn't enough for all of > the parrot initializations, it will cause a collection run. This in turn > depends that the various *_pools were correctly initialized. If this is > not the case (say, we're allocating memory for the pools in the first > place), bad things happen. > > Likewise, the construction of a new PMC for the stash_hash could cause a > DOD to occur if there aren't enough PMC headers available. Granted one > won't cause this, or even a few. But the possibility exists, if say the > stash_hash PMC initializes a bunch of default PMCs for its use, or > somesuch. > > > > code, comments, or documentation, from what I can tell. And I personally > > > hate implicit assumptions like these...they're the cause of all that magic > > > voodo, far away from where it's obvious what the problem is. We have no > > > idea of knowing how much is 'enough' for the GC to allocate at startup, > > > aside from the following the code and tracking allocations, which can be > > > a pain. > > > > I guess it depends on what you consider to be initialization. Little, if > > any, of what the interpreter itself demands in memory - including memory for > > the GC itself - is going to be reclaimable; it's all mostly persistent > > throughout the life of the interpreter. > > Right. However, that's not to say that memory cannot grow. The interpreter > allocates the various *_pools from the interpreter's memory_pool, and it > gets copied with each collection run. This memory can grow and change in > size as more memory for pools are needed. Currently, we use the GC's > buffer system for tha. Alternately, we could use mem_sys_allocate, and > manually free and allocate memory for them. The latter would then allow > these pools to not be copied during collection runs, and would eliminate > that particular element of unsafety. > > > If there's not enough memory to bootstrap the interpreter and the GC, then > > even if the GC *could* reclaim enough to satisfy subsequent requests, > > there's not going to be nearly enough memory left to do anything. So why > > bother? > > Because if there's not enough memory, it can still allocate *more* memory. > It just tries to reclaim enough memory before resorting to the 'extreme' > of getting more system memory. I hope that makes things clear. > > > (Just) enough of the interpreter needs to start up to start up the memory > > allocator and the GC subsystem. If it needs more memory to do so, it should > > simply be given more memory. After the GC is up and running, the > > interpreter/bootstrapping process can trigger a GC run to free up as much > > memory as it can. The remainder of the interpreter can then start up, > > running through the GC. > > You're, right, it should be given more memory to do so. But since it > performs a collection run before getting more memory, and since it's not > safe to perform a collection run during initialization, bad things happen. > > Mike Lambert >
[PATCH] string_grow doesn't allocate new buffer if nothing to copy
The string_grow function (currently used only by string_replace) does not allocate a new buffer if there are no bytes to be copied from old buffer to new buffer. Patch below fixes this. -- Peter Gibbs EmKel Systems Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.68 diff -u -r1.68 string.c --- string.c 12 Apr 2002 01:40:28 - 1.68 +++ string.c 12 Apr 2002 08:20:54 - @@ -76,11 +76,11 @@ INTVAL copysize = s->bufused; if(addlen < 0) copysize += addlen; -if(copysize <= 0) -return s; /* Don't check buflen, if we are here, we already checked. */ newbuf = Parrot_allocate(interpreter, s->buflen + addlen); -mem_sys_memcopy(newbuf, s->bufstart, (UINTVAL)copysize); +if (copysize > 0) { +mem_sys_memcopy(newbuf, s->bufstart, (UINTVAL)copysize); +} s->bufstart = newbuf; s->buflen += addlen; return s;
Re: [PATCH] Disable GC at startup
"Michel J Lambert" wrote: > Below is a patch which allocates the pools from system memory. > Unfortunately, it doesn't seem to provide any noticeable speed gains. I > get anywhere from a -1 to 15 extra generations per second. Current results Even though we copy large amounts of memory around, very little time seems to be used in the process. My current test version, with COW strings and a few other tweaks, achieves: 5000 generations in 80.735562 seconds. 61.930578 generations/sec A total of 32768 bytes were allocated A total of 117851 DOD runs were made A total of 333 collection runs were made Copying a total of 1347808 bytes compared to a current CVS version of: 5000 generations in 90.813806 seconds. 55.057708 generations/sec A total of 32768 bytes were allocated A total of 130932 DOD runs were made A total of 10930 collection runs were made Copying a total of 57801936 bytes so a 97% decrease in the number of collection runs only turns into an 11% improvement in total performance. [166MHz Pentium; linux 2.2.18] -- Peter Gibbs EmKel Systems
Re: Profiling Parrot
Dan Sugalski wrote: > I think perhaps a rewrite of life.pasm into perl with some > benchmarking would be in order before making that judgement. Following is a rough perl5 version of life.pasm. On my system [Pentium 166; linux 2.2.18; perl 5.6.1] this takes 96 to 97 seconds; CVS parrot takes 91 to 92 seconds (without JIT) So, thus far, we aren't doing too badly. I am sure the perl version can be optimised, but as it stands I think it is a reasonable equivalent of what the pasm version does. -- Peter Gibbs EmKel Systems #!/usr/bin/perl # # life.perl # # Play conway's (no, not *him*. The other conway) game # of life # # First the generation count $I2 = 5000; # Note the time (for perl, this is seconds only) $N5 = time; # If true, we don't print $I12 = 1; $S15 = " " . " " . " *" . " ** " . " * * " . " " . " * " . " * " . " * " . " * " . " * " . " * " . " " . " " . " "; dump_life(); $I0 = 0; while ($I0 < $I2) { $I0++; generate(); dump_life(); } $N6 = time; $N7 = $N6 - $N5; print "$I2 generations in $N7 seconds. "; $N8 = $I2; $N1 = $N8 / $N7; print "$N1 generations/sec\n"; #GC statistics deleted for perl version # S15 has the incoming string, S0 is scratch, S1 is scratch, S2 is scratch # # I0 is the length of the string # I1 is the current cell we're checking # I2 is the count for that cell # I3 is the offset to the neighbor sub generate { my ($I0, $I1, $I2, $I3); $I0 = length($S15); $S1 = ""; for ($I1 = 0; $I1 < $I0; $I1++) { $I2 = 0; $I3 = (-16 + $I0 + $I1) % $I0; $I2++ if (substr($S15, $I3, 1) eq "*"); $I3 = (-15 + $I0 + $I1) % $I0; $I2++ if (substr($S15, $I3, 1) eq "*"); $I3 = (-14 + $I0 + $I1) % $I0; $I2++ if (substr($S15, $I3, 1) eq "*"); $I3 = (-1 + $I0 + $I1) % $I0; $I2++ if (substr($S15, $I3, 1) eq "*"); $I3 = (1 + $I0 + $I1) % $I0; $I2++ if (substr($S15, $I3, 1) eq "*"); $I3 = (14 + $I0 + $I1) % $I0; $I2++ if (substr($S15, $I3, 1) eq "*"); $I3 = (15 + $I0 + $I1) % $I0; $I2++ if (substr($S15, $I3, 1) eq "*"); $I3 = (16 + $I0 + $I1) % $I0; $I2++ if (substr($S15, $I3, 1) eq "*"); if (substr($S15, $I1, 1) eq "*") { if ($I2 < 2 or $I2 > 3) { $S1 .= " "; } else { $S1 .= "*"; } } else { if ($I2 == 3) { $S1 .= "*"; } else { $S1 .= " "; } } } $S15 = $S1; } # S15 has the incoming string, S0 is scratch sub dump_life { return if ($I12 > 0); print "\f\n\n\n\n\n\n\n\n\n\n\n"; print "- generation $I0 -\n"; my ($I0, $I1); for ($I0 = 0, $I1 = 14; $I1 >= 0; $I1--, $I0 += 15) { $S0 = substr($S15, $I0, 15); print "$S0\n"; } #sleep 1; }
Re: Is there a way to turn GC completely off?
Clinton A. Pierce wrote: > I'm fighting a now-you-see-it now-you-don't kind of bug and I was wondering > if there's a way to completely turn off garbage collection and memory > re-use for debugging? The defined procedures for doing this are not currently implemented. The simplest way for now is probably just to add return statements at the beginning of Parrot_do_dod_run and/or Parrot_go_collect in resources.c -- Peter Gibbs EmKel Systems
[PATCH] Re: Is there a way to turn GC completely off?
The specific problem Clinton mentioned is yet another infant mortality problem, this time in string_concat. I don't know what the current decision is on handling these situations, but this one can be avoided by optimising the code anyway. If the transcoding is done before making the result string, we know the actual length, instead of using the maximum possible. I haven't yet looked to see why basic needs to transcode string anyway. -- Peter Gibbs EmKel Systems Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.68 diff -u -r1.68 string.c --- string.c12 Apr 2002 01:40:28 - 1.68 +++ string.c14 Apr 2002 18:45:54 - @@ -283,14 +283,14 @@ if (a != NULL && a->strlen != 0) { if (b != NULL && b->strlen != 0) { -result = string_make(interpreter, NULL, a->bufused + - b->strlen * a->encoding->max_bytes, - a->encoding, 0, a->type); -mem_sys_memcopy(result->bufstart, a->bufstart, a->bufused); +/* transcode first so we know the length and avoid infanticide */ if (a->type != b->type || a->encoding != b->encoding) { b = string_transcode(interpreter, b, a->encoding, a->type, NULL); } +result = string_make(interpreter, NULL, a->bufused + b->bufused, + a->encoding, 0, a->type); +mem_sys_memcopy(result->bufstart, a->bufstart, a->bufused); mem_sys_memcopy((void *)((ptrcast_t)result->bufstart + a->bufused), b->bufstart, b->bufused); result->strlen = a->strlen + b->strlen;
Re: Is there a way to turn GC completely off?
[Follow up to my previous post] lib/Parrot/Assembler.pm creates all string constants with chartype = 0, which, by virtue of the sequence of the enums in include/parrot/chartype.h means unicode. So all string constants are type: unicode, encoding: singlebyte. I am not going to try and fix this one, but it does mean that the string transcoding routines are getting called when nobody expects it. There may well be other places in string.c where this causes problems; I will look at this shortly. -- Peter Gibbs EmKel Systems
Re: Resync time... [APPLIED]
Note that string_grow still has the problem with not bothering to allocate a new buffer if copysize is zero, e.g. if we are expanding a previously empty buffer. I have submitted a patch for this previously, but since string_grow has now changed, herewith a resynced patch. -- Peter Gibbs EmKel Systems Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.72 diff -u -r1.72 string.c --- string.c 15 Apr 2002 20:04:03 - 1.72 +++ string.c 15 Apr 2002 20:18:33 - @@ -72,11 +72,6 @@ */ STRING * string_grow(struct Parrot_Interp * interpreter, STRING * s, INTVAL addlen) { -INTVAL copysize = s->bufused; -if(addlen < 0) -copysize += addlen; -if(copysize <= 0) -return s; /* Don't check buflen, if we are here, we already checked. */ Parrot_reallocate(interpreter, s, s->buflen + addlen); return s;
Re: Resync time... [APPLIED] [APPLIED again]
Dan Sugalski wrote: > This has been applied too. There's something bugging me about it--I > think there may be issues, and I'd really like it if we made sure we > had a bunch of zero-length string tests in the test suite. The only issue I am currently aware of with zero-length strings is the fact that mem_allocate will allocate 16 bytes during creation, but go_collect will allocate zero space, and hence we end up with multiple buffer headers pointing to the same physical memory address (see previous thread "Definition of a null string?" http://www.mail-archive.com/perl6-internals@perl.org/msg09000.html) This does not seem to cause any problems, since buflen is zero, so the buffer will never be accessed; however it looks very strange when debugging the GC. Attached is a part of my COW patch, which relies on the 'round up to next multiple of 16' behaviour of Parrot_allocate, and makes the additional allocated space available for subsequent use for string expansion. -- Peter Gibbs EmKel Systems Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.73 diff -u -r1.73 string.c --- string.c 15 Apr 2002 20:34:28 - 1.73 +++ string.c 15 Apr 2002 21:02:38 - @@ -47,13 +47,12 @@ } s = new_string_header(interpreter); -Parrot_allocate(interpreter, s, buflen); +Parrot_allocate(interpreter, s, buflen | 15); s->encoding = encoding; /* Make sure we maintain the flags we might already have on the * string header we just fetched */ s->flags |= flags; s->type = type; -s->buflen = buflen; if (buffer) { mem_sys_memcopy(s->bufstart, buffer, buflen); @@ -73,7 +72,7 @@ STRING * string_grow(struct Parrot_Interp * interpreter, STRING * s, INTVAL addlen) { /* Don't check buflen, if we are here, we already checked. */ -Parrot_reallocate(interpreter, s, s->buflen + addlen); +Parrot_reallocate(interpreter, s, (s->buflen + addlen) | 15); return s; }
Re: Resync time... [APPLIED] [APPLIED again]
> Is this still needed with the update to Parrot_allocate to use > buffers? I think I took care of that. No, Parrot_allocate still sets buflen to the original requested 'size'; not the inflated 'req_size' - I'm not sure what you intended. The behaviour of Parrot_allocate needs to clearly defined somewhere, as the COW code depends on it - it needs to allocate a flag byte within the buffer but outside the used space; my current code uses the last byte, and therefore keeps buflen one shorter than the physical size. The alternative is for buflen to always be the real allocated size (which means the size adjustment logic in go_collect can be removed); then the COW string handling routines would have to ensure that buflen-1 is always used as the available buffer size, which is not a major problem. -- Peter Gibbs EmKel Systems
Re: Resync time... [APPLIED] [APPLIED again]
Dan Sugalski wrote: > This feels excessively hackish. It's OK if we were welding in COW to > an existing code base, but we're not--if we put it in, we put it in > right. > > Right, in this case, is using the COW flag bit in the Buffer > structure and making the changes to the appropriate string_* > functions to use it, I think. The COW flag bit in the string header is used by the string functions; however, for GC purposes, I need a flag within the buffer as well, to identify buffers which have already been copied, since multiple string headers could point to the same physical buffer. That is the best I can do, so perhaps we should leave COW for somebody else to implement later. For interest, I am attaching an extract from my version of resources.c (not in diff format, because I have not completed syncing yet - things keep changing!) strstart is a void* field in STRING, pointing to the start of the string of interest in the buffer; this could be omitted if substrings were not COWed The last byte of the allocated buffer is always set to zero by string_make and other string creators When a buffer is copied, the address of the first STRING that referenced it is stored in the (old copy of the) buffer, and the in-buffer flag is set; if the same buffer is encountered later, the second header is simply adjusted to point to the new location. -- Peter Gibbs EmKel Systems /* Run through all the string header pools and copy */ for (cur_string_arena = interpreter->arena_base->last_STRING_Arena; NULL != cur_string_arena; cur_string_arena = cur_string_arena->prev) { UINTVAL i; STRING *string_array = cur_string_arena->start_STRING; for (i = 0; i < cur_string_arena->used; i++) { /* Is the string live, and can we move it? */ if (string_array[i].flags & BUFFER_live_FLAG && !(string_array[i].flags & BUFFER_immobile_FLAG) && string_array[i].bufstart) { UINTVAL offset = (char *)string_array[i].strstart - (char *)string_array[i].bufstart; if (string_array[i].flags & BUFFER_COW_FLAG && ((char*)(string_array[i].bufstart))[string_array[i].buflen]) { /* buffer has already been copied; just change the header */ STRING* hdr = *(STRING **)(string_array[i].bufstart); hdr->flags |= BUFFER_COW_FLAG; string_array[i].bufstart = hdr->bufstart; string_array[i].strstart = (char *)(string_array[i].bufstart) + offse t; } else { memcpy(cur_spot, string_array[i].bufstart, string_array[i].buflen+1); if (string_array[i].flags & BUFFER_COW_FLAG) { /* store new address and set 'buffer already copied' flag */ *(STRING **)(string_array[i].bufstart) = &string_array[i]; string_array[i].flags &= ~BUFFER_COW_FLAG; ((char*)(string_array[i].bufstart))[string_array[i].buflen] = 1; } string_array[i].bufstart = cur_spot; string_array[i].strstart = (char *)(string_array[i].bufstart) + offse t; cur_size = string_array[i].buflen; cur_size += 16; cur_size &= ~0x0f; cur_spot += cur_size; } } } }
Re: Resync time...
Simon Glover wrote: > Well, one issue with this patch is that Parrot will now segfault if > (s>buflen + addlen) < 0. It doesn't seem possible to actually provoke > this behaviour at the moment, however - string_grow is only called > from one place in string_replace, and the code in string_replace > ensures that addlen > 0. > > It's easy enough to add a test in string_grow to avoid the problem, > but I'm not sure what we should return in that case: the input > string s, or a newly created empty string? Dan has since rewritten string_grow to use Parrot_reallocate; however, the problem still remains. In this particular case, based on the name, description and usage of the function, there seems to be no reason not to change addlen to be a UINTVAL. However, this raises a more general point regarding preconditions. At one point do we start/stop trusting ourselves? string_replace calls string_grow which calls Parrot_reallocate which calls mem_allocate. If we say that string_replace (or other string_* functions) cannot be trusted to call string_grow correctly, surely we must then modify mem_allocate to handle a negative request length. -- Peter Gibbs EmKel Systems
Re: [netlabs #522] BASIC hangs and crashes, Win32 MSVC++, 0.0.5
Mike Lambert wrote: > Undoing the patch in resources.c seems to fix the problem. > > Changing: > ((Buffer *)buffer)->buflen = req_size; > to: > ((Buffer *)buffer)->buflen = size; > makes it work again. Just for interest, the problem here is that the rounding is always up to the next multiple of 16. So, for example, a zero-length string would have buflen set to 16 (actually it is set back to zero in string_make, but that just slows the process down slightly); string_copy would ask for a buffer of 16 and get back a buffer of 32, etc, so every time a string is copied, it grows by 16 bytes. This effect is exacerbated by the fact that "set S1, S2" does a string_copy - I am still not sure what is supposed to happen here; I believe that the pure set opcode should just be doing a register copy?? There is a clone opcode which also does a string_copy, which seems reasonable. The only advantage of having buflen always be the actual allocated length would be to remove the rounding logic in go_collect; note that go_collect rounds up only if the length is not already a multiple of 16; this difference in algorithm was where the whole discussion started in the first place. I suggested an increased buflen value specifically for strings, to allow mutable string operations to use the space which has already been allocated anyway. For example, the recently-added 'string_to_cstring' function will always reallocate a buffer the first time it is called for any given string, even though the physical allocation would have been sufficient. -- Peter Gibbs EmKel Systems
String mortality
Two more problems found in string.c; these relate to the creation of temporary strings to hold results of transcoding, in string_concat and string_compare. As per the latest (I think) decision from Dan ("Avoiding the deadlands", 9th April: http://www.mail-archive.com/perl6-internals@perl.org/msg09072.html), the following patch does the following: 1) Add BUFFER_neonate_FLAG (actually renamed BUFFER_needs_GC_FLAG, since this is not used at present, and can always be added again if it is needed in the future) - feel free to change the name to anything you fancy 2) Add neonate counters to interpreter structure (I added three separate counters; as it is really only required as a flag to indicate that cleanup is needed, one would probably suffice) 3) Change GC routines to treat 'neonate' string/buffer headers in the same way as constants 4) Change string_concat and string_compare to set and clear the 'neonate' flag as required Still required as per the above-referenced decision: 1) Implement equivalent flag for PMCs (unless the 'immune' flag serves the same purpose?) 2) Procedure to clear neonate flag on all headers from time to time Note that this patch gives compiler warnings in string.c because of the 'const' attribute on the parameters, and therefore should not be applied in its current form; I'm sure somebody can figure out how best to resolve the warnings. -- Peter Gibbs EmKel Systems Index: include/parrot/interpreter.h === RCS file: /home/perlcvs/parrot/include/parrot/interpreter.h,v retrieving revision 1.40 diff -u -r1.40 interpreter.h --- include/parrot/interpreter.h 3 Apr 2002 04:01:41 - 1.40 +++ include/parrot/interpreter.h 22 Apr 2002 12:58:47 - @@ -142,6 +142,9 @@ requests are there? */ UINTVAL GC_block_level; /* How many outstanding GC block requests are there? */ +UINTVAL neonate_strings;/* How many protected newborn strings ? */ +UINTVAL neonate_buffers;/* How many protected newborn buffers ? */ +UINTVAL neonate_PMCs; /* How many protected newborn PMCs ? */ } Interp; #define PCONST(i) PF_CONST(interpreter->code, (i)) Index: interpreter.c === RCS file: /home/perlcvs/parrot/interpreter.c,v retrieving revision 1.84 diff -u -r1.84 interpreter.c --- interpreter.c 15 Apr 2002 18:05:18 - 1.84 +++ interpreter.c 22 Apr 2002 13:05:56 - @@ -497,6 +497,9 @@ interpreter->memory_collected = 0; interpreter->DOD_block_level = 1; interpreter->GC_block_level = 1; +interpreter->neonate_strings = 0; +interpreter->neonate_buffers = 0; +interpreter->neonate_PMCs = 0; /* Set up the memory allocation system */ mem_setup_allocator(interpreter); Index: include/parrot/string.h === RCS file: /home/perlcvs/parrot/include/parrot/string.h,v retrieving revision 1.35 diff -u -r1.35 string.h --- include/parrot/string.h 24 Mar 2002 22:30:06 - 1.35 +++ include/parrot/string.h 22 Apr 2002 12:59:23 - @@ -65,8 +65,8 @@ /* Private flag for the GC system. Set if the buffer's in use as * far as the GC's concerned */ BUFFER_live_FLAG = 1 << 12, -/* Mark the bufffer as needing GC */ -BUFFER_needs_GC_FLAG = 1 << 13, +/* Mark the bufffer as newborn, for protection from infant death */ +BUFFER_neonate_FLAG = 1 << 13, /* Mark the buffer as on the free list */ BUFFER_on_free_list_FLAG = 1 << 14, /* This is a constant--don't kill it! */ Index: resources.c === RCS file: /home/perlcvs/parrot/resources.c,v retrieving revision 1.45 diff -u -r1.45 resources.c --- resources.c 19 Apr 2002 01:33:56 - 1.45 +++ resources.c 22 Apr 2002 13:00:47 - @@ -341,7 +314,8 @@ STRING *string_array = cur_string_arena->start_STRING; for (i = 0; i < cur_string_arena->used; i++) { /* Tentatively unused, unless it's a constant */ - if (!(string_array[i].flags & BUFFER_constant_FLAG)) { + if (!(string_array[i].flags & +(BUFFER_constant_FLAG | BUFFER_neonate_FLAG))) { string_array[i].flags &= ~BUFFER_live_FLAG; } } @@ -353,7 +327,8 @@ Buffer *buffer_array = cur_buffer_arena->start_Buffer; for (i = 0; i < cur_buffer_arena->used; i++) { /* Tentatively unused, unless it's a constant */ - if (!(buffer_array[i].flags & BUFFER_constant_FLAG)) { + if (!(buffer_array[i].flags & +(BUFFER_constant_FLAG | BUFFER_neonate_FLAG))) { buffer_array[i].flags &= ~BUFFER_live_FLAG; } } Index: string.c ==
Re: COW Revisted?
> The data which needs to be stored along with the buffer data, can be > stored as either a header or a footer. The size of this header needs to be > a multiple of 16 (or possibly even 8) bytes, so that the real buffer > which follows would be correctly aligned. I'm not sure if this applies for > a footer. > A footer might allow us to to tack 5-8 bytes on to the end of an > allocation, which might not always go 'over' the 16 byte rounding-up limit > we currently have in place. As long as we're not just-below the 16 byte > barrier on most of our allocations, this shouldn't waste any more memory. My current code (see http://www.mail-archive.com/perl6-internals@perl.org/msg09196.html) requires one flag byte within the buffer itself, provided the buffer is always guaranteed to be large enough to hold a pointer, which the current 16-byte allocation scheme ensures. To implement substrings requires an additional pointer (or offset) in the string header. I was assuming that we only want to implement COW for strings, as non-string buffers are generally speaking not under our control. I still think that the cheapest implementation for the flag byte is a footer, as we don't need to worry about alignment. This is actually a zero-cost option as far as memory allocation is concerned, as the current allocation scheme always allocates at least one extra byte. Variable-sized string headers are not really an option; if the overhead of an additional pointer is a problem, my first inclination would be to combine the chartype and encoding pointers into a single vtable entry. Dan's recent addition of (re)allocate_about conflicts with my current footer implementation, so I do not have a synced-up version of my COW code at present. These functions have also re-introduced the string growth problem (see http://www.mail-archive.com/perl6-internals@perl.org/msg09322.html); the patch and sample program below can be used to illustrate this. -- Peter Gibbs EmKel Systems Index: core.ops === RCS file: /home/perlcvs/parrot/core.ops,v retrieving revision 1.130 diff -u -r1.130 core.ops --- core.ops 24 Apr 2002 20:31:39 - 1.130 +++ core.ops 28 Apr 2002 10:59:35 - @@ -2366,6 +2366,18 @@ =cut +=item B(in STR) + +Print string header information for debugging purposes. + +=cut + +inline op debug(in STR) { + printf("%p -> %p; buflen %lu, bufused %lu, flags %lx\n", + $1, $1->bufstart, $1->buflen, $1->bufused, $1->flags); + goto NEXT(); +} + ### set S0, "test" set S1, S0 debug S1 clone S2, S1 clone S1, S2 clone S2, S1 clone S1, S2 clone S2, S1 clone S1, S2 debug S1 end
Re: COW Revisted?
Dan Sugalski wrote: > So, let's do this: > > 1) We'll add allocate_string and reallocate_string functions, which > the strings use. It'll give us COW space at the end of the string > data. > > 2) We'll add in new_*_const_header to match the new_*_header > functions, to allocate String/Buffer/PMC headers from constant header > arenas rather than from the default arenas > > 3) We'll add in (re)?allocate_const functions to allocate memory from > constant pools rather than the default collectable pools. > > 4) We'll add in COW functionality for strings and see what sort of > win we get. (I'm not sure that we'll win much with general COW, since > my gut feeling is that most COW strings will have a constant string > as their source, but I could be wrong here. That'd be OK) I'll make a start on this on Monday, but it may take a few days to get it all in place. I will probably post it as a series of patches, rather than trying to do (break?) it all at once. Does anybody have any thoughts on adding a new function in string.c to return the pointer to the start of the string, so all access to the innards of a STRING struct can be restricted to string.c? This would mean the logic to access a COWed substring would be confined to one place. After the introduction of string_to_cstring, there are very few places that access s->bufstart directly anyway. -- Peter Gibbs EmKel Systems
First patch to memory allocation routines
Herewith the first set of patches to the memory allocation routines. There is no new functionality here yet; basically I have been working on trying to remove some of the code that is duplicated between the various pools, before even more copies get made for the new stuff. The result is some fairly major changes, but I believe it will make adding the new features easier. I have tested the end product using the standard test suite, 5000 generations of life, loading Clint's basic wumpus, and reversing resources.c using Melvin's reverse.cola. We don't have much in the way of PMC-intensive test code at present. A summary of the changes: struct free_pool has new member 'replenish'; this is a function to be called when the pool is empty Initialisation of the free pools moved from memory.c to resources.c - mainly to avoid exporting more functions from resources.c New function new_free_pool to create a free pool - just to stop doing the same thing several times New function add_to_free_pool - replaces existing add_pmc_to_free and add_header_to_free New function get_from_free_pool - called by the new_*_header functions to avoid duplicating the code involved in extracting an entry from the free pool; also incorporates the calling of do_dod_run and the replenishment of the pools [Mike - in the process, I have removed some of your GC_DEBUG stuff; if this patch is applied, feel free to decide where you want to reinstate this] Essentially, the free pools have become the centre of the allocation process. If anybody has any problems with this basic concept, please let me know asap. The next stage will be the creation of a header arena and memory pool for constant strings. Files changed: resources.h, resources.c, memory.c -- Peter Gibbs EmKel Systems Index: include/parrot/resources.h === RCS file: /home/perlcvs/parrot/include/parrot/resources.h,v retrieving revision 1.27 diff -u -r1.27 resources.h --- include/parrot/resources.h 23 Apr 2002 19:41:15 - 1.27 +++ include/parrot/resources.h 29 Apr 2002 14:27:21 - @@ -44,6 +44,8 @@ void buffer_lives(Buffer *); +void Parrot_initialize_free_pools(struct Parrot_Interp *); + #define STRING_HEADERS_PER_ALLOC 128 #define PMC_HEADERS_PER_ALLOC 128 #define BUFFER_HEADERS_PER_ALLOC 128 @@ -77,6 +79,7 @@ struct free_pool { Buffer pool_buffer; size_t entries_in_pool; +void (*replenish)(struct Parrot_Interp *, struct free_pool *); }; struct Memory_Pool { Index: resources.c === RCS file: /home/perlcvs/parrot/resources.c,v retrieving revision 1.48 diff -u -r1.48 resources.c --- resources.c 27 Apr 2002 19:31:08 - 1.48 +++ resources.c 29 Apr 2002 14:25:19 - @@ -14,39 +14,77 @@ #include "parrot/parrot.h" -static void add_header_to_free(struct Parrot_Interp *interpreter, - struct free_pool *pool, void *to_add); - -/* Add a string header to the free string header pool */ -static void add_pmc_to_free(struct Parrot_Interp *interpreter, -struct free_pool *pool, void -*to_add) { - PMC **temp_ptr; - /* First, check and see if there's enough space in the free pool. If - we're within the size of a STRING pointer, we make it bigger */ - if (pool->entries_in_pool * sizeof(PMC *) >= - pool->pool_buffer.buflen - sizeof(PMC *)) { -/* If not, make the free pool bigger. We enlarge it by 20% */ -Parrot_reallocate(interpreter, - &pool->pool_buffer, - (UINTVAL)(pool->pool_buffer.buflen * 1.2)); - - } +/* Create a new free pool */ +static struct free_pool * +new_free_pool(struct Parrot_Interp *interpreter, size_t entries, + void (*replenish)(struct Parrot_Interp *, struct free_pool *)) { +struct free_pool *pool; +size_t temp_len; + +pool = mem_sys_allocate(sizeof(struct free_pool)); +temp_len = entries * sizeof(void *); +pool->pool_buffer.bufstart = mem_allocate(interpreter, &temp_len); +pool->pool_buffer.buflen = temp_len; +pool->pool_buffer.flags = BUFFER_live_FLAG; +pool->entries_in_pool = 0; +pool->replenish = replenish; +return pool; +} + +/* Add entry to free pool + * Requires that any object-specific processing (eg flag setting, statistics) + * has already been done by the caller + */ +static void +add_to_free_pool(struct Parrot_Interp *interpreter, + struct free_pool *pool, void *to_add) { +void **temp_ptr; + +/* First, check and see if there's enough space in the free pool. If + we're within the size of a pointer, we make it bigger */ +if (pool->entries_in_pool * sizeof(void *) >= +pool->pool_buffer.buflen - sizeof(void *)) { + /* If not, make the free pool bigger. We
Re: First patch to memory allocation routines
Dan Sugalski wrote: > 1) Has the external interface changed, and are you planning on having > it change? So far, no. mem_allocate will shortly need to be told what pool to allocate from; but I hope to remove this function from the external interface entirely. Other than that, it should just be the addition of the new functions; this will mostly impact string.c at present. > 2) How's the performance relative to the old stuff? According to my benchmarks, no change whatsoever at this stage; however, I would like somebody else to monitor this and let me know if things change significantly in either direction - Mike, would you mind helping out here? > 3) Do you want the memory manager/GC hat? At the moment, I've got a fair amount of free time, so I'm happy to help out as much as possible. However, this is a fairly crucial area, so I would like at least one other person to be keeping a close eye on what I'm doing, to try to avoid too many major disasters. Hopefully, once the current exercise is finished, we will have a pretty tight and stable memory management system, which shouldn't need much hand-holding. -- Peter Gibbs EmKel Systems
Memory management revamp - phase 2
Attached is the second set of patches for the tracked resource memory management system. Features: Separate memory pools for buffers, normal strings, constant strings; the latter is never compacted Each pool has its own alignment; these are currently set to 16, 4 and 2 respectively; suggestions welcome The current default allocations for these pools are: 16K, 32K and 8K respectively. Separate header pools for normal vs constant strings; the latter is never subjected to dead object detection String and Buffer processing combined as much as possible, since a string is really just a specialised buffer Most of the compile-time constants have been moved to variables; this will allow for dynamic changes to parameters API changes: I decided not to add new functions for allocating constant string headers and buffers. Instead, the constant flag must be set in the flags passed to string_make; these flags are now passed to new_string_header, and it will use the constant pool if requested. Currently, this only affects packfile.c, the patch for which is included herein. Note that changing the value of the constant flag during the life of a string could be detrimental to something. Other changes: A new opcode 'stringinfo' has been added to core.ops to allow access to the innards of a STRING, the list of arguments can be found in string.h. This currently serves very little purpose, but it will be useful for writing tests for COW, when this is added (shortly?) The code for 'open' in io.ops has been amended to use string_to_cstring instead of its own null-termination code; this code has mortality issues if termination of the mode string triggers GC; I have not bothered to handle that, as the correct solution is to pass Parrot strings directly to PIO_open. Still to do: The current handling of the free pool buffers in compact_buffer_pool is messy, and introduces another place that code needs to be added if a new resource type is added. I am currently thinking along the lines of some sort of 'system buffer' list, containing pointers to any buffer headers that are not in the normal arenas, but still need to be included in the collection process. COW will be added within a day or two; initially, just for full strings, as this doesn't need changes to anything outside string/resources String mortality still needs further work (e.g. the temporary transcoded strings created in string_concat, etc) Statistics collection may be moved to the pool level rather than the interpreter level, for possible use in dynamic tuning Documentation! Performance: This version is slightly faster in some cases; however, this is due to the increased memory allocation of the multiple pools, rather than any inherent improvements. -- Peter Gibbs EmKel Systems phase2.patch Description: Binary data
Re: Memory management revamp - phase 2
Steve Fink wrote: > > Why use 2-byte alignment on constant strings? Wouldn't it speed up > memory copies to make them always be 4-byte aligned? (Or > machine-register-size aligned?) Changed to 4-byte alignment for now. If there is a specific sizeof(X) you would prefer, just change it. > 1. Can you reformat to the coding conventions? They say that a Okay, I've finally gotten round to downloading the modules needed for run_indent.pl. I hand-synchronized with your previous patch; obviously I missed a few things. > 2. I think this line will break on some compilers. Casts are not > guaranteed to be lvalues: > > > +(char *)b += pool->unit_size; Changed to b = (Buffer *)((char *)b + pool->unit_size); > 3. Does this function need to be declared near the top in order for it > to have a chance to be inlined? You're right, of course. Moved to before the first call. > 4. This is just a question that I'm too lazy to read the code and > figure out for myself. Why are functions like compact_string_pool > needed? What does it do differently from the generic version for > Buffers? a) There is (as yet?) no linkage between memory pools and buffer header pools, so the compact function needs to know what headers to process b) COW may well be implemented for strings only (at least initially), this will require changes to the compact function There is code, e.g. the freeing of the old memory blocks, that is pure duplication; this can be moved to a separate common function. -- Peter Gibbs EmKel Systems phase2.patch Description: Binary data
Re: [APPLIED] Re: Memory management revamp - phase 2
Mike Lambert wrote: > Wow. Tinderbox gave us fall and spring in a few hour period. Nice. Here's > a patch to help make us make it to summer again. Steve actually pointed out that this style of code was not safe, but I still managed to miss that one. > -(char *)cur_buffer += pool->unit_size; > +cur_buffer += (Buffer *)((char *)cur_buffer + pool->unit_size); That should be > +cur_buffer = (Buffer *)((char *)cur_buffer + pool->unit_size); -- Peter Gibbs EmKel Systems Index: resources.c === RCS file: /cvs/public/parrot/resources.c,v retrieving revision 1.51 diff -u -r1.51 resources.c --- resources.c 5 May 2002 04:02:59 - 1.51 +++ resources.c 5 May 2002 05:28:36 - @@ -213,7 +213,7 @@ for (i = 0; i < pool->units_per_alloc; i++) { cur_buffer->flags = BUFFER_on_free_list_FLAG; add_to_free_pool(interpreter, pool, cur_buffer); -(char *)cur_buffer += pool->unit_size; +cur_buffer = (Buffer *)((char *)cur_buffer + pool->unit_size); } }
Copy-on-write strings
Although the tidying up of resources.c is not complete yet, I decided to implement COW strings anyway. This implementation handles the following: Copied strings and substrings are COWed instead of copied (i.e. new string header only); this applies to normal and constant strings. Functions that alter strings in-place make a new copy first if required (actually, in some circumstances copies are made when they could be avoided; this will be fixed shortly) The GC process will detect when a previously shared buffer is no longer shared, and clear the COW flag The COWing of substrings requires a major interface change: *** All references to the data in a STRING must use the new strstart pointer instead of bufstart *** This patch makes that change to current CVS code; all work-in-progress will need to be amended accordingly. Note that the JIT code has been broken in the process; I have left that for somebody who understands it. The 'string_tail' data added to the end of (non-constant) strings has been defined as a struct in case any other uses are found for it. The meaning of BUFFER_constant_FLAG has been changed slightly; it now means that the buffer resides in the (uncompacted) constant string pool; the header may be for a true constant or for a copy/substring of one. In the latter case, the BUFFER_COW_FLAG will also be set. A new function 'string_append' has also been added, for two-argument concatenation. I believe that this operation will be common enough to warrant an optimised function for it. Tests have been added to string.t for specific problems encountered during this exercise that did not cause any of the existing tests to fail. Additional tests for the functionality of the COW system will be added shortly. Performance seems to be somewhat improved for string-intensive programs (eg life). Because of the split pools, non-string buffers do not suffer any additional overhead, so performance of non-string-intensive programs should be unaffected. Feedback welcome. -- Peter Gibbs EmKel Systems phase3.patch Description: Binary data
Re: Copy-on-write strings
- Original Message - From: "Dan Sugalski" <[EMAIL PROTECTED]> > Actually, we don't. (Sez the man catching up on altogether too much > mail) Since we're putting the COW stuff at the tail end, substrings > of COW strings are fine. You set the bufstart to where the substring > starts, buflen set so it goes to the end of the original buffer, and > the string length bit holds the real string length. That way you > start where you need to, but you can still find the COW marker off > the end of the original buffer. > As I understand it, that means putting the original buffer length in the tail, and using true_bufstart = (char *)s->bufstart - (tail->buflen - s->buflen) in compact_string_pool? -- Peter Gibbs EmKel Systems
Re: Copy-on-write strings
Hi Nicholas The final design is now waiting on Dan, but it is always interesting to see other ideas. Like you, I rejected the parent/child technique. However, my proposed solution did not use any links at all, because it relies on the garbage collection system to determine when a shared buffer has only one reference. It worked as follows (ignoring substrings for simplicity of explanation): Each string header (the equivalent of a scalar in this regard, as it points to a specific string instance) points to the shared buffer The string header contains a COW flag, set in both source and destination headers when a string is [not] copied The buffer also contains a flag, used only during garbage collection and always initialised to zero When a string needs to be changed, we simply copy the data into a new buffer and clear the COW flag in the header During the memory pool compaction phase of the GC system, the first header encountered that points to a specific buffer will cause the buffer to be collected as normal; the special flag is set in the old copy of the buffer to say that this buffer has been relocated, and the address of this first header is stored in the old buffer. The COW flag is cleared at this time. If another header is found that points to a buffer that has already been moved, no copying takes place, and the header is simply corrected to point to the new location of the buffer. The first header has its COW flag set again. The result is that the last header of a COWed string will still believe that the buffer is shared until a GC collection run occurs, and therefore could result in buffers being copied unnecessarily. Your system eliminates this problem; however, I believe that Dan may be averse to using a linked list - we'll see. -- Peter Gibbs EmKel Systems
Re: Request for more string_replace() semantics
Melvin Smith wrote: > BUT... for the string_replace internal API it might still be worthwhile > for performance to add additional semantics so we don't have to > next stuff for performance. > > I'm open to adding a dest_len arg or something, but consider this a > request for comments on any additional string semantics that we might > be missing that I'm unaware of. My concern here would be where to stop. If string_replace allows a substring for its replacement string (incidentally, that would be 2 extra arguments, for offset and length), then perhaps string_concat should allow both its arguments to be substrings, and so on. If we look at the code in pasm (excuse my rearrangement of the original post): > my $orig = "123123""; > my $rep = "456456"; > > If I want orig to be 123456 I have to do > > substr($orig, 3, 3, substr($rep, 0, 3) ); set S0, "123123" set S1, "456456" substr S31, S1, 0, 3 substr S0, 3, 3, S31 As compared to the proposed: set S0, "123123" set S1, "456456" substr S0, 3, 3, S1, 0, 3 There are three things that cause the existing code to be slower: 1) Creating a new string header for the substring 2) Copying the data 3) The interim header stays in use until S31 is reused and a DOD run occurs Point 1 can be helped by reducing the cost of creating headers; I have done some work in this area which I hope to post soon. Point 2 is probably the least significant cost; but it can be avoided by COW. Point 3 contributes greatly to point 1, as the header creation is fairly cheap if there are free headers available. The best option I have thought of for helping here, is to allow voluntary explicit disposal of unwanted resources, i.e. the addition of 'free S31' into the above code, since the compiler knows it is not needed after execution of that statement. This would return the buffer header to the free pool, and clear the register. The actual string data would only be reclaimed by GC. This is not a trivial task, as there is currently no method of determining what resource type an S register points to; however a hacked-together test I did a few weeks ago indicated about a 5% improvement in 'life' with just this change. Basically, I am suggesting that we put more effort into tuning the basic engine, rather than add additional opcodes/functions to improve performance of specific high-level functions. If, when we get to the stage of running real code, we find a specific operation is running too slowly, then we can look at specialised handling for it. Incidentally, try the following little program sometime: substr S0, "constant", 0, 8, "variable" print "This is a " print "constant" print " constant.\n" end This can be disallowed at assembly time (see patch below); but, as a general rule, where should we be preserving our constants? -- Peter Gibbs EmKel Systems Index: core.ops === RCS file: /home/perlcvs/parrot/core.ops,v retrieving revision 1.137 diff -u -r1.137 core.ops --- core.ops 15 May 2002 05:01:15 - 1.137 +++ core.ops 15 May 2002 15:45:48 - @@ -1777,7 +1777,7 @@ goto NEXT(); } -inline op substr(out STR, in STR, in INT, in INT, in STR) { +inline op substr(out STR, invar STR, in INT, in INT, in STR) { $1 = string_replace(interpreter, $2, $3, $4, $5, &$1); goto NEXT(); }
Re: [netlabs #579] [PATCH] hash test showing bug in concat()
Steve Fink wrote: > This is a patch to the hashtable test suite, but the bug looks like it > might be in string.c: > a is coming in marked as UTF-32, which seems incorrect (I can view > a->bufstart as a plain char*, and it has more than one letter in it.) > b comes in as usascii, which is correct. b gets transcoded to UTF-32. > Then when string_make() is called, b's bufstart seems to get wiped > out. > /* transcode first so we know the length and avoid infanticide */ > if (a->type != b->type || a->encoding != b->encoding) { > b = string_transcode(interpreter, b, a->encoding, a->type, > NULL); > } > result = string_make(interpreter, NULL, a->bufused + b->bufused, > a->encoding, 0, a->type); The first problem sounds similar to one I noticed a while back, where constants were not getting the right type; but then they were appearing as Unicode/singlebyte, so something must have changed since - I haven't looked at it recently. The second problem is a known infant mortality problem; a previous patch fixed one cause of it (which is why the comment is there), but another problems remains - see http://www.mail-archive.com/perl6-internals@perl.org/msg09311.html for a post on this subject, which never elicited any response. -- Peter Gibbs EmKel Systems
More memory management changes
The attached patch is the next set of proposed changes to the memory management routines, with the copy-on-write logic removed. Performance numbers on my 166-MHz Pentium (linux 2.2.18) for 5000-generation life.pasm are: Current CVS - 50.41 generations/second With this patch - 57.40 With COW & string_append - 65.43 The main features are: Revised 'live' marking system with removal of mark_X_unused functions Initial framework for dynamic allocation algorithms Incorporation of the free pools into the standard buffer header pool, to avoid special handling in compact_buffer_pool Details The 'live' flag is now only valid during a DOD run. It is cleared when an object is created and again at the end of the DOD run. This means that the overhead of a separate pass through the pools to clear the flag at the start of a DOD run is eliminated. For PMCs, the next_for_GC field is treated in the same way - initialised to NULL in new_pmc_header, and reset to NULL in free_unused_PMCs. The 'on_free_list' flag is used instead to identify live buffers during memory compaction. The initial allocation for each pool has been reduced from 128 to 16, this is obviously an arbitrary number, but only once we have a large set of benchmarks will be able to calculate the best 'average' values. However, each time a new allocation is required, the number of units allocated is increased; controlled by parameters currently defined at the top of resources.c. Again, these numbers just seemed reasonable in some very simple tests, and need more tuning. Also, instead of waiting for a pool to be empty before refilling it, each pool has a replenishment level, at which a new block will be allocated. These replenishment levels currently start at half the original pool size, and increase with every allocation, but at a slower rate. This helps reduce the thrashing that can otherwise occur when the pool oscillates between empty and almost empty. These rules are mainly to introduce the concept of dynamic response, hopefully suggestions/implementations of better algorithms will follow in due course. The free pool buffer headers were being held as independent Buffer structures within the Resource_Pool; this meant that they had to be separately handled in compact_buffer_pool, which looked a mess, and made it easy to forget things. They are now held as pointers, and allocated through the normal procedure, except for the first one. The normal buffer pool's free pool header is first allocated from system memory, so that the pool can be initialised. Then, a new buffer header is allocated in the standard way, and takes over from the initial one, which is returned to the system. This change required the addition of an 'immune' flag for buffers. I suspect that Steve's recent changes, which make variable-sized buffer headers simple, will allow other internal structures to be handled using the buffer mechanism, and so the immune flag could be generally useful anyway. [Steve - it seems to me that the 'normal' buffer pool should just be replaced by the size 0 pool in your new system? I would think twice about incorporating strings, as that might complicate COW, if it ever happens.] -- Peter Gibbs EmKel Systems combined.patch Description: Binary data
Re: Test op/stacks:29 dying.
Chris Ball wrote: > t/op/stacks.ok 28/29#Failed test (t/op/stacks.t at line 592) > # got: '' > # expected: '43210-1 This test has a missing 'end' in the code and segfaults. Patch below. -- Peter Gibbs EmKel Systems Index: t/op/stacks.t === RCS file: /home/perlcvs/parrot/t/op/stacks.t,v retrieving revision 1.15 diff -u -r1.15 stacks.t --- t/op/stacks.t 15 May 2002 05:01:32 - 1.15 +++ t/op/stacks.t 15 May 2002 20:09:30 - @@ -617,6 +617,7 @@ print I1 print "\n" + end CODE 43210-1 OUTPUT
[netlabs #583] Re: Test op/stacks:29 dying.
# New Ticket Created by "Peter Gibbs" # Please include the string: [netlabs #583] # in the subject line of all future correspondence about this issue. # http://bugs6.perl.org/rt2/Ticket/Display.html?id=583 > Chris Ball wrote: > t/op/stacks.ok 28/29#Failed test (t/op/stacks.t at line 592) > # got: '' > # expected: '43210-1 This test has a missing 'end' in the code and segfaults. Patch below. -- Peter Gibbs EmKel Systems Index: t/op/stacks.t === RCS file: /home/perlcvs/parrot/t/op/stacks.t,v retrieving revision 1.15 diff -u -r1.15 stacks.t --- t/op/stacks.t 15 May 2002 05:01:32 - 1.15 +++ t/op/stacks.t 15 May 2002 20:09:30 - @@ -617,6 +617,7 @@ print I1 print "\n" + end CODE 43210-1 OUTPUT
Re: Difference between memory and string pools
Hi Mike > We have three memory_pools: memory, string, and constant_string. Is there > any reason that string and memory are distinct? (aside from code > simplicity). > Seems to create a lot of duplicate logic with the need to support a string > version and a generic buffer version of everything, when they are so > similar. One difference is that my private version of compact_string_pool contains code for COW strings, which means the actual compaction code is significantly different. Steve Fink has suggested merging the two with the use of flags to differentiate the behaviour; obviously, with no COW code in the live version, compact_string_pool could just be scrapped, and the string header pool added (index -2 ?) to compact_buffer_pool. Another difference, instigated by Dan when he proposed this split in the first place, is that the alignment factor for the string pool is smaller than that for the buffer pool. A side effect of the split pools is targeted compaction i.e. the buffer pool gets compacted only when it is full, and the string pool likewise - this does provide a performance benefit in situations where one pool grows regularly and the other does not. If we were to eliminate the separation of pools based on type, perhaps we need to consider some other means of having multiple pools, purely for performance reasons. This could, for example, be based on the buffer header size, so each resource pool would have its own memory pool. -- Peter Gibbs EmKel Systems
Re: More memory management changes
The COW patch has been revised for minimal impact to the outside world. The only files changed are: resources.h, resources.c, string.h, string.c There is one API change: Parrot_reallocate_string has an additional parameter. However, nobody outside string.c really has any business calling that anyway; and currently nobody else does. The STRING structure has one change: a new field, void *buffer, points to the start of the physical buffer allocation. I tested a version that eliminated this requirement, by placing the allocated buffer length in the string tail; but this made the code in compact_string_pool very cumbersome, and the performance was several percent slower than my current implementation. Once the latest memory management patch is in, I will resync and submit the COW code as a separate patch, and we'll see what everybody thinks about it. I have just done a benchmark using Melvin's reverse.cola program. Reversing string.c went from 3.42 user seconds down to 3.12 with the COW patch; while reversing core_ops.c went from 94.39 to 92.26. Since this program does a large amount of stack manipulation, I suspect that the much smaller percentage improvement achieved on the larger file is due to stack overhead - I haven't had a change to profile it yet, but I will do as soon as I can. Unrelated subject: I see that Steve has modified string.c to get around the problem with the temporary transcoded strings - I believe we need to finalise a standard way of handling these situations soon. If anybody is interested, I will resync my previous 'neonate' patch - it needs a bit of work to fit with the latest changes to resources.c -- Peter Gibbs EmKel Systems
Re: GC design
I submitted a patch to implement the initial parts of Dan's method some time ago, which has never been applied. (Although Dan recently agreed that it needed to be done, and I will be updating it shortly in line with changes to the memory management system) However, thinking about it some more, there may be a way to avoid the overhead of the separate cleanup phase, if we enforce the rule that any objects created during a single opcode must be rooted at the end of the opcode or be disposable, and apply a variation of Melvin's generations: Add a counter to the interpreter structure, which is incremented every opcode (field size would not be particularly important) Store this counter value in every new object created, and set the 'new object' flag (by doing this on every object, we remove the requirement for the creating function to be aware of what is happening) If an object is encountered during DOD that claims to be new, but was not created during the current opcode, dispute the claim. If the counter has exactly wrapped in the meantime, an object might survive longer than it should. Advantages Totally transparent outside of resources.c No additional passes required Objects will be reclaimed during the first DOD run after the creating opcode Disadvantages Additional field in PMC and Buffer headers Per-opcode counter required therefore fixed performance degradation Slightly more expensive logic to detect dead objects during DOD run Comments, anyone? I agree that something needs to be done quite urgently. As an example, Steve's current fix to string_concat (disable DOD) causes performance problems: Disabling DOD during string_concat: 5000 generations in 101.397038 seconds. 49.311105 generations/sec A total of 360896 bytes were allocated A total of 1501 DOD runs were made A total of 1474 collection runs were made Copying a total of 28677296 bytes There are 4770 active Buffer structs There are 14464 total Buffer structs With this change removed (life doesn't trigger the bug): 5000 generations in 97.344093 seconds. 51.364185 generations/sec A total of 57344 bytes were allocated A total of 101397 DOD runs were made A total of 5418 collection runs were made Copying a total of 7957080 bytes There are 102 active Buffer structs There are 512 total Buffer structs This highlights something that needs to be fixed anyway - blocked DOD runs are effectively cancelled, whereas they should just be postponed. -- Peter Gibbs EmKel Systems
[netlabs #601] [PATCH] Simplified version of temporary buffer immunity patch
# New Ticket Created by "Peter Gibbs" # Please include the string: [netlabs #601] # in the subject line of all future correspondence about this issue. # http://bugs6.perl.org/rt2/Ticket/Display.html?id=601 > Attached is a simplified version of a previous patch to allow buffers to avoid collection during their formative nanoseconds. This version covers the basics: A new flag - BUFFER_neonate_FLAG This flag causes immunity from collection during DOD runs The flag is set and cleared as required in string.c (note that there may be more places that need flag setting, I just did a few obvious ones) This version makes no attempt to track the fact that there are newborns, or to kill them if they try to keep this status for too long. It therefore needs to be specifically used only when required. I am proposing this as an interim solution until we decide on the best way of handling it permanently, to overcome the current bugs without resorting to the performance-impacting method of suppressing DOD runs. A few warnings in string.c have been removed in the process, along with some duplicate #define's in string.h -- Peter Gibbs EmKel Systems
[netlabs #602] [PATCH] Protect pack opcode from suicidal infants
# New Ticket Created by "Peter Gibbs" # Please include the string: [netlabs #602] # in the subject line of all future correspondence about this issue. # http://bugs6.perl.org/rt2/Ticket/Display.html?id=602 > Attached patch to core.ops implements neonate protection for the 'pack' opcode. The flag doesn't need to be cleared, since string_destroy handles this. -- Peter Gibbs EmKel Systems
Re: Hashtable+GC problems
Steve Fink wrote: > Is there some way to make the default be that things will not get > moved around, and allow routines to volunteer to have their guts > scrambled if they know how to handle it? A few random thoughts re buffers that don't wander around on their own: Create a new memory pool for immobile buffers (yes, we have a use for that flag at last) - I don't think we need different resource pools Immobile memory pools would have a new type of compactor function, that would maintain a linked list of free space (I suspect this would require multiple passes) Allocation would occur using one of the old-fashioned algorithms like first-fit This scheme may use more memory than copy-collection, due to fragmentation, and would obviously be slower for memory allocation, since it has to traverse a list of free blocks. In effect, it would be equivalent to using the operating system's memory allocation routine, except that we still retain GC control over it. However, being more expensive than the current Parrot allocation scheme may still be cheaper than a different homegrown solution every time somebody needs a buffer to stay put - the idea would be to use immobile buffers only when no more elegant solution is available for a specific problem, rather than have them as the default - most buffers should not be location sensitive? Alternatively, we say that buffers are, by definition, not location sensitive. Then we force the use of PMCs in those situations, and re-implement the original concept of telling a PMC to move its data to a new location. That would require scanning PMCs during memory pool compaction - at which point we are pretty close to unification of buffers and PMCs anyway?? Comments? -- Peter Gibbs EmKel Systems
[netlabs #607] [PATCH] COW strings (again)
# New Ticket Created by "Peter Gibbs" # Please include the string: [netlabs #607] # in the subject line of all future correspondence about this issue. # http://bugs6.perl.org/rt2/Ticket/Display.html?id=607 > Attached is another patch to implement copy-on-write strings. Summary of changes * New field 'buffer' in STRING points to the start of the allocated buffer. I have put this field immediately after the Buffer fields, in case we want to create a generic subclass of Buffer with the same functionality * The existing 'bufstart' field is redefined as pointing to the start of the string data, this is what all code outside the memory manager expects anyway * Similarly, buflen means the distance from bufstart to the end of the buffer, rather than the physical size of the buffer * New struct String_Tail, added at the end of each string buffer, for private use by the garbage collector * Parrot_(re)allocate_string changed to add the tail for non-constant strings * Parrot_reallocate_string now takes an additional parameter to specify the flags required for the new buffer * compact_string_pool handles COW strings appropriately; the logic caters for both full strings and substrings, and clears the COW flag on buffers that become unshared * string_copy and string_substr changed to create COWs, note that the 'const' attribute has been removed from these, as well as string_transcode, since the COW process sets a flag in the source string * string_replace changed significantly, to use COW features where possible These changes should have no impact on any code outside of string.c and resources.c Performance should be improved for string-intensive processing in most cases; for example, my tests show approximately a 5% improvement in life.pasm. The next patch will be some documentation for the memory management system, if I can figure out how it works. -- Peter Gibbs EmKel Systems
Re: [netlabs #607] [PATCH] COW strings (again)
> I think I'd like these differently. STRING is a subclass of Buffer, > and the bufstart and buflen fields in Buffers point to the actual > start of memory and length of allocated data. I'd rather keep it that > way, and have the 'real' string length and string start be extra > fields in the STRING part. Quote from http://www.mail-archive.com/perl6-internals@perl.org/msg09497.html: >The COWing of substrings requires a major interface change: >*** All references to the data in a STRING must use the new strstart pointer >instead of bufstart *** Actually, we don't. (Sez the man catching up on altogether too much mail) Since we're putting the COW stuff at the tail end, substrings of COW strings are fine. You set the bufstart to where the substring starts, buflen set so it goes to the end of the original buffer, and the string length bit holds the real string length. That way you start where you need to, but you can still find the COW marker off the end of the original buffer. [End quote] -- Peter Gibbs EmKel Systems
Re: [netlabs #609] Replenish-Level Simplification
Mike Lambert wrote (via RT) > Below code implements REPLENISH_LEVEL_FACTOR, which is a percentage fro 0 > to 1 which indicates at what point it will allocate more headers. Also > gives us a speedup of roughly 1.5% :) Thanks for the patch, it certainly simplifies things. I went around in several circles playing with different algorithms, then found a bug elsewhere in my code that meant the replenishment logic wasn't working properly anyway, so I fixed that and just left the replenishment stuff as it ended up. However, your logic requires that you back-calculate the total pool size - I think we might as well just add that to the Resource_Pool struct, as it sounds like a useful number to know. Then replenish_level can be scrapped and just calculated when required. Perhaps the factor could then be made per-pool also, in case we want to be able to tune it for different resource types - what do you think? -- Peter Gibbs EmKel Systems
Re: [netlabs #613] Parrot BASIC SEGV's with much string handling
Clinton A. Pierce wrote: > * sync up, and get the latest Parrot BASIC. It's fully hash-enabled and > quite speedy now. > > * Run "basic.pl" to assemble the interpreter, and get it started > > * At the "Ready" prompt, "LOAD eliza" > > * When finished, type RUN > > The crash will happen shortly thereafter. Comments in #parrot seem to The problem appears to originate in the routine SFETCH, which tries to fetch key "TWIRL" from the hash in P21 without finding it. The hash code returns a null pointer under these circumstances, which is accepted as an empty string by some, but not all, string handling code. In this case, attempting to store the string into an array is invoking string_copy, which assumes the input to be valid. IIRC Dan stated some time ago that checks for null were not to be included, but I don't know what the current status is. -- Peter Gibbs EmKel Systems
Re: LZW in pasm
Sean O'Rourke wrote: > >I took a look at what was going on, and found that the GC probably needs a > >good tuning. For the 20K file, parrot is doing 217 collections of the > >string pool, the last 102 of which reclaim less than 10% of the pool. > >Changing compact_string_pool() to increase the pool size by a factor of > >(0.5 - pct_freed_last_time) if it reclaimed less than 50% of memory > >reduced this to 21 collections, much fewer of which reclaimed abysmally > >small amounts of memory. And Dan Sugalski replied: > Interesting. Could you work with Peter Gibbs and see what the two of > you can do to come up with an adaptive collection system? That's the > best way to go in the long term--the current system's functionally > complete, but far from well tuned. I am currently busy working on a variation of this theme - I was planning to get COW strings in before starting on the memory pool tuning phase, but I have put that on hold for a while to make sure I get it right next time. My current work is based on counting the size of freed buffers - yes, this adds some overhead in free_unused_buffers, but it seems like it may be worth it - and only doing compaction runs when a predefined fraction of the pool size is available for freeing. I am also looking at implementing an inflation factor similar to Sean's suggestion above. Another option is to allow the block size used to expand the pool to grow, so if the pool is growing rapidly, we will allocate less larger blocks; this keeps the linked list shorter, and means less calls to the operating system. Sean's program is ideal as a benchmark for this (after I removed all the shl's before the pack's, otherwise I just get files full of nulls). My latest test runs show an improvement on my system as follows: zip.pbc, input = string.c (23688 bytes) Current CVS version: 85.30 seconds (it's only an old 166MHz Pentium I) Latest test version: 1.77 seconds I have also been playing with variants of Mike Lambert's proposal for simplifying the resource pool tuning, so I don't have a patch ready yet; something will definitely be out sometime over the weekend. As usual, all suggestions are most welcome. -- Peter Gibbs EmKel Systems
Re: hash values and comparisons of strings
Steve Fink wrote: Reading this made me wonder if we should consider cached string transcodings, if we don't end up storing strings in a single form internally. The worst case is probably string constants, which could be transcoded over and over again into the same alternate encoding. As an extension of this, the hashing algorithm could be deemed to be another encoding i.e. the hashed value of a string could also be cached. Trying to decide when a cache entry was no longer needed could be a little bit tricky, but it might be worth giving some thought to. Interestingly, the current implementation of string_compare first transcodes if required, then proceeds to do a codepoint-by-codepoint comparison loop anyway. If a new vtable entry was added to do a single codepoint extract and transcode, i.e. extract_unicode or some such, then the transcodings could be removed. This would be best handled using iterators, as the current two vtable calls (decode, then advance) would then become one. Also, if the encodings and chartypes match, a straight memory compare could be used - this is what we would want for opaque data anyway. As far as the GC goes, it looks like what you really need for the hash structures is immobile memory. This can almost be done at the moment by resorting to using system-level allocation (checking the code I see that BUFFER_sysmem_FLAG memory will not be correctly released at present; this will be fixed). Alternatively, the ability to lock a specific buffer for a short period might be useful; this is somewhat difficult to implement with a copy-collection system, but not impossible. In the meantime, we can introduce functions in resources.c to handle the blocking/unblocking of collections instead of direct manipulation; then an attempt to collect while blocked could set a flag, and the unblocking would trigger the blocked collection - this should reduce the impact of blocking. -- Peter Gibbs EmKel Systems
Re: [COMMIT] Configure.pl 2.0
For now, just delete config_h.in from the STICKY_FILES line in config/gen/makefiles/root.in and re-run Configure.pl I'm sure Brent will sort it out properly later. -- Peter Gibbs EmKel Systems
Re: crash problem with PerlInt
Jens Rieks wrote: >I've used PerlInt instead of PerlString by mistake. >The error occurs if code like this is executed a few times: >("data" is a PerlInt) > >find_global P1, "data" > set S1, P1 ># > set P1, S1 > >It works a few thousend times, then it stops working >often resulting in a crash. >The code works like expected when >- using PerlString instead of PerlInt or The PerlInt set_string code currently tries to transform the PMC into a PerlString, however it does not do a complete job. Specifically, it does not set the flag PMC_is_buffer_ptr_FLAG, and therefore the string is not traceable from the root set. This is easy to fix, but I believe (hope?) that this whole transmogrification thing will change anyway. The reverse problem also applies - if you create a PerlString, then set it to an integer, the flag stays set; this could also cause unpredictable behaviour. For now, just assume that any attempt to turn one type of PMC into another is likely to have undesirable consequences, and you should be safe. -- Peter Gibbs EmKel Systems
[netlabs #619] [PATCH] Memory manager/garbage collector speedup (sometimes)
# New Ticket Created by "Peter Gibbs" # Please include the string: [netlabs #619] # in the subject line of all future correspondence about this issue. # http://bugs6.perl.org/rt2/Ticket/Display.html?id=619 > The attached patch improves performance for programs that allocate an increasing amount of string (or other buffer) memory, rather than extensively manipulating a small amount of data. It works by delaying memory pool compaction until a preset fraction (currently set to 20%) of the total memory allocated to that pool can be reclaimed; this significantly reduces the number of compaction passes on programs that allocate a lot of memory without releasing it soon thereafter. Programs that don't fall into this category may experience a slight performance decrease due to the increased overhead of tracking the size of the reclaimable data. The next round of updates will aim at further reducing the number of DOD runs, which should regain performance in programs that create and delete a large number of objects. Benchmark Before After life.pasm, 5000 generations 86.7787.87 zip.pasm, jit_cpu.c (56641 bytes) 52.02 4.22 reverse.cola, jit_cpu.c8.24 5.14 hanoi.pasm, 14 discs 31.4231.80 All programs are run with output suppressed (life) or redirected to /dev/null I would be very interested in getting benchmark results from other platforms (above are from a Pentium 166MHz, 96MB, linux 2.2.18) -- Peter Gibbs EmKel Systems
Re: LZW in pasm
Sean O'Rourke wrote: > But really I think the way to go would be a generational collector -- after > we've collected a pool once or twice (or after we've failed to recover > more than x% from it on the previous run), we put it off to the side and > compact (indeed, look at) it less often. New objects are allocated from a > new pool, since they are less likely to live past the next collection than > ones that have already been around awhile. For now I am going to continue trying to keep the current system functioning reasonably efficiently. As long as we keep the API small and simple, we should be able to plug in a totally redesigned system later. We are getting to the stage now where Parrot is becoming fairly usable, although the PMC architecture still needs some revision, and therefore should be able to start getting a better idea of the demands that are going to be made on the memory manager. I have briefly read the CMM paper, and have downloaded the code to have a proper look at it sometime. When I get further along with the design process of the next generation I will post it for comments; in the mean time please feel free to contribute any ideas you have (or, better still, just write the thing - I really don't mind!) -- Peter Gibbs EmKel Systems
Re: [netlabs #619] [PATCH] [REPOST corrected] Memory manager/garbage collector speedup (sometimes)
I managed to send the wrong version of the patch on the previous post! Herewith the correct (I hope) one. Apologies to all. -- Peter Gibbs EmKel Systems reclaim.patch Description: Binary data
Re: quicksort in pasm
"brian wheeler" wrote: > I've written a quicksort which will sort a file on stdin. It could go > into examples, I suppose. > Building parrot with -pg makes it run 5s slower. The gprof -T dump of > the timings looks like this: I find it useful, if running on a system that normally uses the computed goto core, to supply the -g switch to parrot, to tell it to switch to the standard core instead. That way, you don't get a big block of time just allocated to cg_core. I was starting with a very simple test to decide how to determine where the memory overuse was coming from, but I noticed some unusual results: $ cat test.in aa ee dd cc bb $ ./parrot qs.pbc < test.in Starting quicksort with 0 and 5 Starting quicksort with 0 and 1 Starting quicksort with 0 and 0 Starting quicksort with 2 and 1 Starting quicksort with 3 and 5 Starting quicksort with 3 and 4 Starting quicksort with 3 and 3 Starting quicksort with 5 and 4 Starting quicksort with 6 and 5 dd aa bb ee cc Am I missing something somewhere? -- Peter Gibbs EmKel Systems
Re: GC design
"Mike Lambert" wrote: > I know Dan's proposed solution has already been committed, but I'd > like to add my support for this. In addition to providing a counter per > opcode to ensure that we don't prematurely GC data, this field could also > be reused to calculate the generation of a particular buffer, to help sort > it into a correct generational pool (if/when we do get that). I have almost convinced myself this is essential, due to the problem I am currently fighting (the excessive memory usage in programs like zip and quicksort - see below for more explanation). Unfortunately, there is a fairly major performance impact (on my system, anyway). Perhaps a few volunteers could add the patch below and do before and after mops, life, whatever benchmarks and post the results? This code just adds the overhead of the cycle counter without any benefits. The excess memory usage problem is caused partially because of the decoupling of dead object detection and memory pool compaction. Some of the recent changes have been aimed at improving performance by reducing the frequency of DOD runs, since these are quite slow. However, we can then get the situation where a large number of string allocations take place without running out of free headers, so we never do a DOD run; compaction runs therefore can't reclaim anything, because no strings have been declared dead recently. My preferred solution to this is to trigger DOD runs during memory allocation, to see whether we can find reclaimable memory - this however presents an infant mortality nightmare, which could be addressed using the birthdate technique. As an alternative to this, I am currently testing a scheme using flags, so that the memory allocation system can request that a DOD run occur at the next suitable opportunity, and possible vice versa. > Another proposal is to walk the C stack. In runops (or some similar This is more in line with standard GC systems, which have no option since the program is assumed to be non-cooperative. It would probably require a major rewrite of the memory manager, so, although it needs to considered, I think it should become part of a separate project to design a next generation memory management system from scratch, which can be plugged in later - unless later has to be sooner to accomodate problems that can't be resolved, albeit not optimally, in the current version. -- Peter Gibbs EmKel Systems Index: include/parrot/interpreter.h === RCS file: /home/perlcvs/parrot/include/parrot/interpreter.h,v retrieving revision 1.45 diff -u -r1.45 interpreter.h --- include/parrot/interpreter.h15 May 2002 07:25:37 - 1.45 +++ include/parrot/interpreter.h26 May 2002 08:57:58 - @@ -162,6 +162,7 @@ requests are there? */ UINTVAL GC_block_level; /* How many outstanding GC block requests are there? */ +UINTVAL cycle; /* Opcode cycle counter */ } Interp; #define PCONST(i) PF_CONST(interpreter->code, (i)) Index: lib/Parrot/OpsFile.pm === RCS file: /home/perlcvs/parrot/lib/Parrot/OpsFile.pm,v retrieving revision 1.24 diff -u -r1.24 OpsFile.pm --- lib/Parrot/OpsFile.pm 12 May 2002 18:57:43 - 1.24 +++ lib/Parrot/OpsFile.pm 26 May 2002 08:59:42 - @@ -325,6 +325,9 @@ $body =~ s/\$(\d+)/{{\@$1}}/mg; +# GC Experimental +$body = " interpreter->cycle++;\n{\n" . $body . "}"; + if ($ENV{PARROT_NO_LINE}) { $op->body($body); } else {
[netlabs #628] [PATCH] Make hash.c depend on parrot headers
# New Ticket Created by "Peter Gibbs" # Please include the string: [netlabs #628] # in the subject line of all future correspondence about this issue. # http://bugs6.perl.org/rt2/Ticket/Display.html?id=628 > Following patch adds dependencies entry for hash.c to Makefile. Stops things from getting very confusing when changing layout of String and/or Buffer structs! -- Peter Gibbs EmKel Systems Index: config/gen/makefiles/root.in === RCS file: /home/perlcvs/parrot/config/gen/makefiles/root.in,v retrieving revision 1.2 diff -u -r1.2 root.in --- config/gen/makefiles/root.in24 May 2002 16:34:19 - 1.2 +++ config/gen/makefiles/root.in27 May 2002 10:23:12 - @@ -292,6 +292,8 @@ pmc$(O) : $(GENERAL_H_FILES) +hash$(O) : $(GENERAL_H_FILES) + jit$(O) : $(GENERAL_H_FILES) jit_cpu$(O): $(GENERAL_H_FILES) $(INC)/jit_emit.h
[netlabs #629] [PATCH] Memory manager/garbage collector - major revision
# New Ticket Created by "Peter Gibbs" # Please include the string: [netlabs #629] # in the subject line of all future correspondence about this issue. # http://bugs6.perl.org/rt2/Ticket/Display.html?id=629 > Attached patch does some fairly radical things to the memory manager. 1) Cycle counter added to interpreter structure; incremented by opcodes that have something vaguely resembling a function call in their body. 2) Buffer-like objects have their date of birth (cycle counter value) recorded 3) Neonate flag is deprecated; functionality replaced by the above i.e. no buffer header will be freed during the same opcode as it was created 4) Free pool array replaced by a linked list within the arenas 5) Minimum/maximum ratio of memory pool physical size to used size controlled by new tuning parameters (fixed #define's for now) 6) DOD run performed during memory allocation if compaction will not reclaim enough memory The 'function-call like' logic (which is very primitive at present) for adding the cycle count increment into an op's body means that most simple opcodes should not incur additional overhead. Once an op starts calling functions, the additional overhead of a single increment is likely to be minor. After much thought and playing around, this seems to be the best answer to the problem of infant mortality. Dan's suggestion of reversing the logic of DOD runs would work for the pure infant mortality situation (except perhaps for the odd pathological op), but it still leaves problems with the dependency of buffer memory collection on prior dead object detection. These changes do cause a slight performance degradation, but I believe it is worth it for the overall simplification of transparent protection of the newborn. Performance can only be a secondary goal, after correct behaviour. The runaway memory usage of programs like zip and quicksort has hopefully been resolved; I have successfully run zip.pasm against core_ops.c. Note that the neonate flag has not been removed; if this patch is applied, all use of the neonate flag can be removed, followed by the flag itself. The flag is not used in the new GC logic. The problem of having buffer memory move around still remains; that should be addressed shortly. -- Peter Gibbs EmKel Systems
[netlabs #691] [PATCH] Documentation update
# New Ticket Created by "Peter Gibbs" # Please include the string: [netlabs #691] # in the subject line of all future correspondence about this issue. # http://bugs6.perl.org/rt2/Ticket/Display.html?id=691 > Index: RESPONSIBLE_PARTIES === RCS file: /cvs/public/parrot/RESPONSIBLE_PARTIES,v retrieving revision 1.1 diff -u -r1.1 RESPONSIBLE_PARTIES --- RESPONSIBLE_PARTIES 23 May 2002 18:24:33 - 1.1 +++ RESPONSIBLE_PARTIES 7 Jun 2002 07:21:02 - @@ -5,6 +5,5 @@ Project lead Jeff Goff JITDaniel Grunblatt -GC/Memory Management Peter Gibbs Configure Brent Dax I/OMelvin Smith -- Peter Gibbs EmKel Systems
Re: [COMMIT] GC Speedup
"Mike Lambert" wrote: > I've just committed some changes tonight that improved performance > on the GC-heavy tests in examples/benchmarks/ by about 8%. Thanks Mike, those changes do indeed help. Current numbers on my system for 5000 generations of life with various versions of Parrot (using CVS tags) are: 0.0.5 47.99 generations per second 0.0.6 57.41 0.0.7 20.18 current 21.18 (an improvement of 4.7% over 0.0.7) I suspect my system (166MHz Pentium, linux 2.2.18) may be below the target threshold. I will repeat these tests within a day or two on a faster machine and see what happens. -- Peter Gibbs EmKel Systems
Re: [COMMIT] GC Speedup
"Mike Lambert" wrote: > - addition of stack-walk code to avoid child collection > - the GC refactoring I commited > I suspect the former is what is causing your speed hit, although I'm not > ruling out the possibility that my changes caused a problem as well. Can > you disable the trace_system_stack call and re-run these numbers? With the call to trace_system_stack commented out in dod.c, I get 48.5 generations per second. The full stats are: 5000 generations in 103.185356 seconds. 48.456488 generations/sec A total of 36608 bytes were allocated A total of 42386 DOD runs were made A total of 6005 collection runs were made Copying a total of 72819800 bytes There are 21 active Buffer structs There are 1024 total Buffer structs This compares to the 14th July CVS version: 5000 generations in 81.172149 seconds. 61.597482 generations/sec A total of 58389 bytes were allocated A total of 160793 DOD runs were made A total of 1752 collection runs were made Copying a total of 1228416 bytes There are 81 active Buffer structs There are 192 total Buffer structs > I know that there are faster solutions to the problem of child collection, > but Dan doesn't want to use them due to the problems that occur when we > start using exceptions (and longjmp, etc). If performance has to halve in order to implement such features, I hope somebody plans to write Parrot::Lite! -- Peter Gibbs EmKel Systems
Re: Unifying PMCs and Buffers for GC
Mike Lambert wrote: > I'm currently favoring allowing for header pools on a per-type basis, not > just a per-size basis. This would give us a 'hash' pool. The pool > structure would contain function pointers for collection and/or dod > purposes. (stuff that would otherwise be in a PMC vtable.) I am very much in agreement with this concept in principle. I would like you to consider adding a name/tag/id field to all pool headers, containing a short text description of the pool, for debugging purposes. > > > One idea, which is most closely in line with the current semantics, is to > add a pool pointer to every header. I've found a few times in the past > where such a pointer would have come in handy. This would allow us to call > the pool's mark() function, to handle stuff like pointing-to-buffers, etc. This is something I have done in my personal version, for buffer headers only at present (I have been mainly ignoring PMCs, as I believe they are still immature). I use it for my latest version COW code, as well as to allow buffer headers to be returned to the correct pool when they are detected as free in code that is not resource-pool driven. > b) it allows us to make new types of buffer-like headers on par with > existing structures. On this subject, I would like to see the string structure changed to include a buffer header structure, rather than duplicating the fields. This would mean a lot of changes (e.g. all s->bufstart to s->buffer.bufstart), but would be safer and more consistant. Of course, strings may not even warrant existence outside of a generic String pmc any more. > > a) no pmc type morphing. once in a pool, it stays in a pool. I don't see > this as a big loss, since type morphing is error-prone to begin with, imo. The main issue here would be the definition of pmc type, in an untyped language. We may need a PerlScalar pmc type, as that is what most Perl variables really are - if we stick to using pmc types based on current content, then we need to be able to morph between the different subclasses of PerlScalar as the contents change. > > b) data members! Since not all pmcs are the same size, pmcs are able to > store data elements in their structure. This allows us to make a SV-like > PMC which stores str-value, int-value, float-value, etc. All without Okay, you were obviously thinking the same way! > > Thoughts on all of the above? The main drawback that I see is that we can > have a lot more pools. Currently, we don't take advantage of sized header > pools, so making them per-type won't hurt us. However, by making different > pools for different pmc types, an explosion in base pmc types could cause > an explosion in pools and create wasteful memory usage as each pool stores > 'extra' headers for allocation. This can probably be tuned in some form > to reduce over-allocation's affect, but I thought it wise to bring it up. One option would be to use a limited set of physical sizes (only multiples of 16 bytes or something) and have free lists per physical size, rather than per individual pool. This would waste some space in each header, but may be more efficient overall. > > Finallythe unification of buffers and PMCs means that buffers can now > point to things of their own accord, without requiring that they be > surrounded by an accompanying PMC type. How about the other way round? If the one-size-fits-all PMCs were to be replaced by custom structures, then everything could be a PMC, and buffer headers as a separate resource could just disappear! -- Peter Gibbs EmKel Systems
Re: Unifying PMCs and Buffers for GC
Mike Lambert wrote: > At one point, we had a mem_alloc_aligned, which guaranteed the start of a > block of memory given any pointer into the contents of the block. If we > store a pointer to the pool at the beginning of each set of headers, then > we navoid the need for a per-header pool pointer, at the cost of a bit > more math and an additional dereference to get at it. > > - it imposes additional memory requirements in order to align the block of > memory, and imposes a bit more in this 'header header' at the beginning of > the block of headers. I considered this option also, but dismissed it as you need to allocate twice the required size to get guaranteed alignment, so you are better off with the pointer per header. To use this method without the memory overhead would require implementing another allocator: if you want for example a 1K aligned block, first allocate 16K, discard the amount before the alignment point, and dish out the rest as 15 (or 16 if you're really lucky) 1K aligned pages. I seriously considered this when I changed my buffer memory to be paged instead of a single allocation per memory pool; but I haven't actually implemented it yet. -- Peter Gibbs EmKel Systems
Re: [perl #16085] [PATCH] perlundef.pmc
Leopold Toetsch wrote: (via RT) > this one was left over. > Please apply. > > --- perlundef.pmc Sun Jul 28 13:33:34 2002 > +++ /home/lt/src/parrot-007/classes/perlundef.pmc Thu Aug 8 20:11:49 2002 Thanks Leopold; I have applied this, along with the changes to perlint, perlnum and perlstring that I also left out originally. -- Peter Gibbs EmKel Systems
Request for behaviour definition for assignment to PerlScalar
How should the PerlScalar PMC behave in the following situation? new P0, .PerlScalar new P1, .PerlArray assign P0, P1 The assign opcode will call PerlScalar's set_pmc function, which needs to get a value from P1, so it calls PerlArray's X function?? The two obvious options seem to be: vtable method get_scalar(pmc) vtable method get_value(pmc, context) I would prefer the latter, because it is less Perl-centric, and keeps the sizeof the vtable smaller once other contexts are introduced; however, this is slightly slower, as get_value will need to check its context and act accordingly. If context was an enumeration, a simple switch statement would suffice. In this case, it might be interesting to benchmark the consequences of replacing some of the other get_X vtable functions. e.g. get_integer(pmc) -> get_value(pmc, CONTEXT_INTEGER) Comments, anyone? -- Peter Gibbs EmKel Systems
Re: Request for behaviour definition for assignment to PerlScalar
Peter Gibbs wrote: > vtable method get_scalar(pmc) > vtable method get_value(pmc, context) Having broken the rule of posting before drinking coffee in the morning, this is obviously nonsense - neither of these will work, because we don't know the return type of these functions. So, back to the original problem, with no suggested implementation method now! -- Peter Gibbs EmKel Systems
Re: Request for behaviour definition for assignment to PerlScalar
Dan Sugalski wrote: > This: > >new P0, .PerlScalar >new P1, .PerlArray >assign P0, P1 > > isn't code that a compiler should be emitting. (snip alternative compiler output) > Though I freely admit that's a big punt on the real problem, which we > still need to address. > > I'm up for suggestions, since sooner or later someone's going to emit > code like that, and its nice to know what its supposed to do... How about something along the following lines: void get_preferred_type(INTVAL context) { switch (context) { case CONTEXT_SCALAR: return BASETYPE_INTVAL; } } void set_pmc(PMC* value) { switch (value->get_preferred_type(INTERP, value, CONTEXT_SCALAR)) { case BASETYPE_INTVAL: set_integer(INTERP, SELF, value); break; } } -- Peter Gibbs EmKel Systems
Re: set Boolean to 2
Leopold Toetsch wrote: > new P14, .PerlUndef > set P14, 2 > set P10, P14 > An more generally, shouldn't set_p_p call some vtable function? No. set Px, Py simply sets the Px register to point to the same PMC as Py already points to; it does not care about the content of the PMC in any way. The future 'assign Px, Py' will call a vtable function (set_pmc); however, in the quoted example, the pure register level behaviour is all that is intended. -- Peter Gibbs EmKel Systems
Re: set Boolean to 2
Leopold Toetsch wrote: > I currently can't imagine any useful RL example for the current > implementation of set_p_p as $1=$2, _if_ the involved PMC types are > different. You need to differentiate between the PMC itself, and a Parrot register that holds the address of a PMC. In the instruction "set Px, Py" the only PMC involved is the one pointed to by register Py; if Px happens to point to a PMC then that linkage will be broken, with no effect on the PMC (previously) pointed to by Px; the register Px will then point to the same PMC as Py does. In the code snippet you originally posted, there seemed to be no need for this; however, calling subroutines, for example, will require that the addresses of the parameters be placed in specific registers, and so the set operation may well be used frequently in those circumstances. The instruction that performed the actual assignment to the PMC was the "set Px, 2" instruction; this (which may be replaced by "assign") calls the set_integer_native vtable method. Since the PMC type is PerlUndef at that point, the current automatic morphing kicks in, and the PMC actually gets turned into a PerlInt - in future, the perl compiler will have to create a PMC with the correct type initially, and this morphing will be removed for typed variables. -- Peter Gibbs EmKel Systems
[INFO] The first pirate parrot takes to the air
For purely academic purposes, I have re-synchronised some of my forbidden code with the latest CVS version of Parrot. All tests pass without gc debug; and with gc_debug plus Steve Fink's patch. Benchmarks follow, on a 166MHz Pentium running linux 2.2.18. Parrot African Grey life (5000 generations) 172 seconds 81 seconds reverse /dev/null193 seconds 130 seconds hanoi 14 >/dev/null51 seconds 37 seconds The differences between the two versions are: 1) Use of the interpreter cycle-counter instead of stack walking. 2) Linked lists of buffer headers sorted by bufstart 3) COW-supporting code in GC (for all buffer objects) 4) Implementation of COW for string_copy and string_substr Items 1 and 2 use techniques that have been specifically banned, and items 3 and 4 depend on item 2, so none of this code is usable in Parrot (which is why I haven't attached any patches) Some of the changes I made before the memory management code was totally reorganised have not yet been re-integrated. My last version prior to that reorganisation ran 5000 lives in 61 seconds, and I hope to get back to somewhere close to that again. -- Peter Gibbs EmKel Systems
Re: [INFO] The first pirate parrot takes to the air
Matt Fowles wrote: > I am fairly new to this project and this is the first that I have heard of > these things. What are the techniques that have been banned and where can I > find a list of the things one ought not to do? Don't worry too much about it, Matt. It is part of an ongoing game between myself and Dan Sugalski, where I submit patches and he rejects them for various reasons, normally related to performance or memory usage in critical areas within the Parrot core, such as memory management and garbage collection. If you are interested in the history, some references follow. The interpreter cycle counter first arose in the following patch: http:[EMAIL PROTECTED]/msg10218.html and was immediately rejected because it would cause a general performance hit, rather than being targeted directly at the problem areas. The subject of not adding additional fields to the Buffer header structure arose in a thread started by Dan himself, with contributions from Mike Lambert and myself: http:[EMAIL PROTECTED]/msg11424.html The specific message from Dan prohibiting it was: http:[EMAIL PROTECTED]/msg11570.html -- Peter Gibbs EmKel Systems
Re: [INFO] The first pirate parrot takes to the air
_unused_buffers runs can be delayed until we actually need free buffers in a specific resource (sorry, small object) pool. In conjunction with the code I referred to above for freeing buffer headers during a compaction run, this can reduce the frequency of running free_unused_buffers. This requires that buffer_lives be passed the interpreter, so it can access the cycle counter. The other changes I made were to the string subsystem, and I probably won't do them again because they change too many files: a) replace chartype and encoding by a single charset entry This removes one pointer field from the string header, and simplifies all the string functions, as they are forever checking both fields to determine if transcoding is required. I do not believe that the two fields are orthogonal, and therefore the number of charsets would be less than #chartypes * #encodings. b) some alterations to the single vtable thus created, in particular the addition of a find_substring method. -- Peter Gibbs EmKel Systems - Original Message - To: "Peter Gibbs" <[EMAIL PROTECTED]> Cc: "perl6-internals" <[EMAIL PROTECTED]> Sent: 16 August 2002 07:36 Subject: Re: [INFO] The first pirate parrot takes to the air > > For purely academic purposes, I have re-synchronised some of my > > forbidden code with the latest CVS version of Parrot. All tests pass > > without gc debug; and with gc_debug plus Steve Fink's patch. > > Benchmarks follow, on a 166MHz Pentium running linux 2.2.18. > > > > Parrot African Grey > > life (5000 generations) 172 seconds 81 seconds > > reverse /dev/null193 seconds 130 seconds > > hanoi 14 >/dev/null51 seconds 37 seconds > > > > The differences between the two versions are: > > 1) Use of the interpreter cycle-counter instead of stack walking. > > 2) Linked lists of buffer headers sorted by bufstart > > 3) COW-supporting code in GC (for all buffer objects) > > 4) Implementation of COW for string_copy and string_substr > > 1) Yeah, the approach of cycle-counter is a nice one. I also had a similar > solution involving neonate flag usage, somewhere in the archives. Both > have *significant* speed advantages versus the curent codebase's stack > walking. > > I tried to convince Dan of the merit, but they failed for various reasons: > > implement this involves the interpreter recursively calling runops_core to > handle the vtable method. If you increment cycles on the inner loop, you > risk pre-collection of stuff on the stack of the vtable method calling > stuff. If you don't increment cycles, you prevent any of the memory > allocated inside of this vtable method from ever being collected during > the method execution...bad stuff when your vtable methods are multiplying > gigantic matrices or somesuch. > > My neonate buffers solution fails only in the presence of longjmp. > > Granted, we don't do any of this yet, so these solutions will mop the > floor with my current stackwalk code, and pass tests to boot. But it's the > planned introduction of these requirements which are the reason for making > these solutions 'forbidden'. > > One of Nick's solutions was to fallback on stack-walking to handle the > cases where our faster solutions fail. I can definitely see that working > with neonate buffers to handle the fixups needed after a longjmp call. But > it doesn't seem as attractive in the presence of your solution, for which > it would require stackwalking for all re-entrant runops calls. Do you have > another solutioin in mind for handling re-entrant runops calls? > > As far as the extra byte in the buffer, I don't mind that one at all. > There are a lot of restrictions on the GC code in the interest of making > stuff "lightweight". Unfortuantely, GC code takes a significant portion of > the execution time in any realistic application. Hopefully we can convince > Dan to allow extra fields in the buffers in the interest of speed, but I > don't think we can reduce parrot/perl6's feature set in the interest of > speed...might as well use C if that's what you want. :) > > > 3) Definitely a good one. I've been trying to merge your original COW > patch into my code here. Without GC_DEBUG, it fails one test. With > GC_DEBUG, it fails the traditional set plus that one test. The test case > is rather large unfortunately, I haven't been able to narrow down the > problem further or I'd have committed it. > > > > Some of the changes I made before the memory management > > code was totally reorganised have not yet been re-integrated. > > My last version prior to t
Re: [INFO] The first pirate parrot takes to the air
Dan Sugalski wrote: > How much of the speed win is from the cycle count instead of stack > walking? Unless you've solved the problem of recursive interpreter > calls and setjmp, it's not a valid solution, no matter what the speed > win might be. According to my notes the progression (for 5000 lives) was: CVS: 172 seconds Cycle count instead of stack walk: 97 seconds COW with stack walk: 158 seconds Cycle count + COW: 81 seconds The performance gain from COW is fairly small, as it really only results from the decreased number of compaction runs needed. The performance loss from the stack walking code is the major area; what concerns me is that that overhead is going to be incurred all the time, not only when situations arise that might require it - you may recall that your original objection to the cycle count code was based on the same reasoning. Anyway, as I said, this is purely an academic exercise - I am not proposing that any changes be made to Parrot. When features are added to Parrot that cause my current code to fail, I can find an alternative solution, reject those features, or simply abandon the exercise - you do not have those options, so you have to use code that will support everything you intend to do in the future. > Neither 3 nor 4 fundamentally require 1 or 2, I know that Mike is working on implementing COW in a way that will not break anything else; I have just been experimenting with a different approach that is based on point 2. I will be sending my patch to Mike so he can see if there is anything he can use. -- Peter Gibbs EmKel Systems
Re: [INFO] The first pirate parrot takes to the air
Mike Lambert wrote: > Just for fun, can you run Hanoi on CVS versus CVS+COW? > > I just got COW implemented here, and while I get a 17% speedup on > life, I get a 5% loss on hanoi. Since you only posted life, it's a > bit hard to see if the drop on hanoi is just my fault, or the fault of > COW in general. Hanoi is a difficult case because pretty much the only string manipulation it does is repeat, which doesn't benefit from COW, so all you get is the additional overhead in the compaction code without any compensating gains. I have started doing the timings; however, I no longer have a correct COW-only patch, so I need to do a bit of fiddling. I will post the results as soon as I have them. Attached is my current 'African Grey' patch as discussed yesterday. Also, since you mention hanoi, here (in straight code format, I'm too lazy to do a diff) is an experimental version of string_repeat that tries to reduce the number of calls to memcpy. I can't find my notes at the moment as to what benefit it gave - perhaps you might like to try it sometime. (Note that the string_make call will need to be changed to the split chartype/encoding version) -- Peter Gibbs EmKel Systems STRING * string_repeat(struct Parrot_Interp *interpreter, const STRING *s, UINTVAL num) { STRING *dest; UINTVAL i; char *ptr; dest = string_make(interpreter, NULL, s->bufused * num, s->charset, 0); if (num == 0) { return dest; } dest->bufused = s->bufused * num; dest->strlen = s->strlen * num; mem_sys_memcopy(dest->bufstart, s->bufstart, s->bufused); i = 1; num--; ptr = (char *)dest->bufstart + s->bufused; while (num) { mem_sys_memcopy(ptr, dest->bufstart, s->bufused * i); ptr += (s->bufused * i); num -= i; i *= 2; if (i > num) { i = num; } } return dest; } grey1.patch Description: Binary data
Re: [perl #16269] [PATCH] COW...Again and Again
Hi Mike Elapsed times for 'time parrot hanoi.pbc 14 > /dev/null' are: CVS: 52.81, 52.05, 52.33 CVS + grey COW: 51.53, 52.06, 51.67 CVS + Mike's COW: 44.31, 44.48, 44.55 CVS + grey1: 35.89, 36.48, 36.60 (+COW +cyclecount -stackwalk) End June grey: 30.14, 29.35, 29.53 And 5000 generations of life tested again: CVS: 170.22, 169.01, 168.70 CVS + grey COW: 162.65, 161.44, 163.61 CVS + Mike's COW: 156.86, 157.78, 157.67 CVS + grey1: 80.38, 80.74, 80.69 End June grey: 59.21, 59.41, 59.42 CVS 14th July: 81.22, 81.17 (last timings I recorded before stack walking) So I get an improvement on Hanoi of about 15% using your COW patch, and your COW is better on both tests than mine. -- Peter Gibbs EmKel Systems
Re: Stack Walk Speedups?
Mike Lambert wrote: > As Peter has pointed out, our stackwalk code is rather slow. > Anyone feeling adventuresome and want to attempt to speed this up? If you want to get some improvement at the cost of some duplicated code, you can remove the * direction logic, and write two copies of the loop. (patch attached, but not fully tested) On my machine, that changes 5000 lives from 168 seconds to 133 (tested for one run each only) I'm sure there is more that can be done, but that may help for now. -- Peter Gibbs EmKel Systems dod.patch Description: Binary data