Steve Hay wrote:
> And this program (500,000 small extensions to a string):
>
>     my $a = '';
>     my $start = time;
>     for my $i (1 .. 500000) {
>         print "$i\n" if $i % 1000 == 0;
>         $a .= '.' x 20;
>     }
>     printf "OK (%d seconds)\n", time - $start;
>
> is even worse: 1 second again on 5.8.6/perl-malloc versus 56
> seconds on 5.8.4/system-malloc!

I'm pretty sure realloc() is the culprit here. A common trick used by
string classes is to double the memory allocation whenever a string
needs to grow, which avoids O(n-squared) behaviour when a string is
extended one small piece at a time.

For cheap thrills, I created the following patch against 5.8.6 to hack
this memory doubling trick into Perl_sv_catpvn_flags():

--- sv-orig.c   2004-11-02 03:01:54.000000000 +1100
+++ sv.c        2004-12-16 15:04:39.000000000 +1100
@@ -4394,9 +4394,14 @@
 {
     STRLEN dlen;
     char *dstr;
+    STRLEN neededlen;
 
     dstr = SvPV_force_flags(dsv, dlen, flags);
-    SvGROW(dsv, dlen + slen + 1);
+    neededlen = dlen + slen + 1;
+    if (SvLEN(dsv) < neededlen) {
+        STRLEN s2 = SvLEN(dsv) * 2;
+        SvGROW(dsv, s2 < neededlen ? neededlen : s2);
+    }
     if (sstr == dstr)
         sstr = SvPVX(dsv);
     Move(sstr, SvPVX(dsv) + dlen, slen, char);

Though all tests passed, I don't know Perl internals well enough to
judge whether this hacky change is sound or not. I did it only out of
curiosity, to see how it affected Steve's little benchmark.

After applying this patch, the run time of Steve's test program above
dropped from 26 seconds to 1 second on Windows XP (built with the
default system malloc). BTW, on Linux the improvement was hardly
noticeable: from 0.86 secs to 0.51.

I'm interested to hear opinions on whether this sort of memory
heuristic is best done in the perl core or left to realloc().

/-\
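
P.S. For anyone who wants to see the doubling trick in isolation, outside
of Perl, here is a minimal C sketch. It is not Perl's SV code: the strbuf
type and the strbuf_append() and count_grows() names are made up for
illustration. It appends 20 dots at a time, as in Steve's benchmark, and
counts how often realloc() is called under each growth policy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string buffer (illustration only, not Perl's SV). */
typedef struct {
    char  *buf;    /* NUL-terminated data, or NULL when empty */
    size_t len;    /* bytes in use, excluding the trailing NUL */
    size_t cap;    /* bytes currently allocated */
    size_t grows;  /* number of successful realloc() calls so far */
} strbuf;

/* Append n bytes from src. With doubling != 0 the capacity is at least
 * doubled on each growth; otherwise it grows to exactly what is needed.
 * Returns 0 on success, -1 if realloc() fails. */
static int strbuf_append(strbuf *s, const char *src, size_t n, int doubling)
{
    size_t needed;

    assert(s != NULL);
    needed = s->len + n + 1;
    if (s->cap < needed) {
        size_t newcap = needed;
        char *p;

        if (doubling && s->cap * 2 > needed)
            newcap = s->cap * 2;
        p = realloc(s->buf, newcap);
        if (p == NULL)
            return -1;
        s->buf = p;
        s->cap = newcap;
        s->grows++;
    }
    memcpy(s->buf + s->len, src, n);
    s->len += n;
    s->buf[s->len] = '\0';
    return 0;
}

/* Append 20 dots `appends` times and return the number of realloc()
 * calls that were made, or 0 if an allocation failed. */
static size_t count_grows(size_t appends, int doubling)
{
    static const char dots[] = "....................";  /* 20 dots */
    strbuf s = { NULL, 0, 0, 0 };
    size_t i, result;

    for (i = 0; i < appends; i++) {
        if (strbuf_append(&s, dots, sizeof dots - 1, doubling) != 0) {
            free(s.buf);
            return 0;
        }
    }
    result = s.grows;
    free(s.buf);
    return result;
}
```

With exact-fit growth every append calls realloc(), so the cost depends
entirely on how cheaply the system realloc() can extend a block in place.
With doubling, the number of calls grows only logarithmically with the
final length, at the price of up to twice the memory being reserved.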