Hey all,

I was debugging a "bug" (quotes because of the extreme edge-case) where
certain cases can cause a stackoverflow internally in the GC when dealing
with extremely deep arrays. For example: this code is not an infinite loop,
but it's so practically big that it's basically an infinite loop for our
purposes.

$a = array(&$a);
function fill (&$a, $i = 3) {
    if (!$i--) return;
    foreach ($a as &$tmp) {
        $tmp = array($tmp, &$a);
        fill($tmp, $i);
    }
}
fill($a);

When you run that on current master, you get a segfault stemming from the
zval_mark_grey() GC function.

This may or may not be "desired" (as you're basically creating a structure
too big to be garbage collected). However, I started playing around with
zval_mark_gray (and the 3 other functions similarly constructed) and tried
re-writing them in a non-recursive manner.

Basically, I'm using a time-memory tradeoff here. By caching the pointers
instead of recursing, I trade a little bit of memory (defaults to blocks of
256*sizeof(**zval)) for decreased recursion (and hence significantly less
stack usage). This tradeoff is a significant win on all fronts for deep
arrays (since it only uses sizeof(**zval) per entry, as opposed to a full
stack frame), but can be a memory loss for wide arrays...

Here's my branch with the changes:

https://github.com/ircmaxell/php-src/compare/zval_mark_grey_tail_recursion

So I decided to clean up the changes, and start benchmarking it. To my
surprise, the results were fairly significant.

First, I modified the script from above to instead have a counter to kill
the script after 200k function calls (fits into about 120mb of memory).

Then, I wrote a benchmark script to execute bench.php and micro_bench.php
multiple times (10 to be exact), and then put out the min, max and average
for each test under this modified version, and stock master. Then it
computed the difference in percentage. I'll link to the gist of the full
results, but here are the 3 summaries (Total):

Full: Total
 Core:     2.094 - 2.1429 - 2.278
 Modified: 2.089 - 2.1323 - 2.208
 Diff: 0.23878% - 0.46532% - 3.07287

Micro: Total
 Core:     9.309 - 9.4995 - 9.878
 Modified: 8.62 - 8.8179 - 9.234
 Diff: 7.40144% - 6.90018% - 6.51954

GC
 Core:     0.5937340259552 - 0.60687265396118 - 0.62824201583862
 Modified: 0.43920111656189 - 0.44599950313568 - 0.47265577316284
 Diff: 26.02730% - 25.60688% - 24.76534

As you can see, that's not a shabby gain in the micro bench (average of 10
runs is a 6.9% gain), and pretty much a gain all around (on average).

Here's the full results (including test scripts, etc):
https://gist.github.com/ircmaxell/5782139

Also note that PHP was compiled in both cases with: ./configure
--disable-all --enable-cli --disable-cgi

I also ran a few tests against the Symfony2 framework (stock testcase), and
it pretty much showed a 1 to 3 second improvement over stock (for a 150
second test).

So not a huge win overall, but can be significant in cases...

What do you think? Is this worth pursuing further? is there a case I'm
missing, or some other reason we shouldn't do this?

Thanks,

Anthony

Reply via email to