Steve Hay wrote:
> And this program (500,000 small extensions to a string):
>
> my $a = '';
> my $start = time;
> for my $i (1 .. 500000) {
>   print "$i\n" if $i % 1000 == 0;
>   $a .= '.' x 20;
> }
> printf "OK (%d seconds)\n", time - $start;
>
> is even worse: 1 second again on 5.8.6/perl-malloc versus 56
> seconds on 5.8.4/system-malloc!

I'm pretty sure realloc() is the culprit here.
A common trick used by string classes is to double the memory
allocation whenever you need to grow a string, and so avoid
O(n-squared) performance when growing a string one char at a time.
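
To illustrate the doubling trick outside of Perl, here is a minimal C
sketch. The strbuf type and both function names are made up for this
example and have nothing to do with Perl's SV internals. It counts
realloc() calls for repeated 20-byte appends, with and without doubling.
It ignores size_t overflow and assumes src never points into the buffer
being grown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string for illustration only (not Perl's SV). */
typedef struct {
    char  *buf;      /* NUL-terminated data, or NULL when empty */
    size_t len;      /* bytes in use, excluding the trailing NUL */
    size_t cap;      /* bytes currently allocated */
    size_t reallocs; /* how many times realloc() was called */
} strbuf;

/* Append n bytes from src. With doubling != 0 the buffer grows to at
 * least twice its current capacity; otherwise it grows to exactly the
 * size needed. Returns 0 on success, -1 if realloc() fails. */
static int strbuf_append(strbuf *sb, const char *src, size_t n, int doubling)
{
    size_t needed;

    assert(sb != NULL && src != NULL);
    needed = sb->len + n + 1;
    if (sb->cap < needed) {
        size_t newcap = needed;
        char *p;

        if (doubling && sb->cap * 2 > needed)
            newcap = sb->cap * 2;
        p = realloc(sb->buf, newcap);
        if (p == NULL)
            return -1;
        sb->buf = p;
        sb->cap = newcap;
        sb->reallocs++;
    }
    memcpy(sb->buf + sb->len, src, n);
    sb->len += n;
    sb->buf[sb->len] = '\0';
    return 0;
}

/* Append a 20-byte chunk `count` times and report the realloc() count. */
static size_t count_reallocs(size_t count, int doubling)
{
    strbuf sb = {NULL, 0, 0, 0};
    const char chunk[] = "....................";  /* 20 dots */
    size_t i, r;

    for (i = 0; i < count; i++)
        if (strbuf_append(&sb, chunk, 20, doubling) != 0)
            break;
    r = sb.reallocs;
    free(sb.buf);
    return r;
}
```

With exact-fit growth every append calls realloc(), so if realloc() has
to copy the block each time, the total work is quadratic in the final
length. With doubling, the number of realloc() calls grows only
logarithmically with the final length.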

For cheap thrills, I created the following patch against 5.8.6
to hack this memory doubling trick into Perl_sv_catpvn_flags():

--- sv-orig.c   2004-11-02 03:01:54.000000000 +1100
+++ sv.c        2004-12-16 15:04:39.000000000 +1100
@@ -4394,9 +4394,14 @@
 {
     STRLEN dlen;
     char *dstr;
+    STRLEN neededlen;
 
     dstr = SvPV_force_flags(dsv, dlen, flags);
-    SvGROW(dsv, dlen + slen + 1);
+    neededlen = dlen + slen + 1;
+    if (SvLEN(dsv) < neededlen) {
+        STRLEN s2 = SvLEN(dsv) * 2;
+        SvGROW(dsv, s2 < neededlen ? neededlen : s2);
+    }
     if (sstr == dstr)
        sstr = SvPVX(dsv);
     Move(sstr, SvPVX(dsv) + dlen, slen, char);

Though all tests passed, I don't know Perl internals well enough to
judge whether this hacky change is sound or not. I did it only out
of curiosity to see how it affected Steve's little benchmark.

After applying this patch, the run time of Steve's test program
above dropped from 26 seconds to 1 second on Windows XP (perl built
with the default system malloc). BTW, on Linux the improvement was
much smaller: from 0.86 secs to 0.51.

I'm interested to hear opinions on whether this sort of memory
heuristic is best done in the perl core or left to realloc().

/-\

