Hey all, I was debugging a "bug" (quotes because of the extreme edge-case) where certain cases can cause a stackoverflow internally in the GC when dealing with extremely deep arrays. For example: this code is not an infinite loop, but it's so practically big that it's basically an infinite loop for our purposes.
$a = array(&$a); function fill (&$a, $i = 3) { if (!$i--) return; foreach ($a as &$tmp) { $tmp = array($tmp, &$a); fill($tmp, $i); } } fill($a); When you run that on current master, you get a segfault stemming from the zval_mark_grey() GC function. This may or may not be "desired" (as you're basically creating a structure too big to be garbage collected). However, I started playing around with zval_mark_gray (and the 3 other functions similarly constructed) and tried re-writing them in a non-recursive manner. Basically, I'm using a time-memory tradeoff here. By caching the pointers instead of recursing, I trade a little bit of memory (defaults to blocks of 256*sizeof(**zval)) for decreased recursion (and hence significantly less stack usage). This tradeoff is a significant win on all fronts for deep arrays (since it only uses sizeof(**zval) per entry, as opposed to a full stack frame), but can be a memory loss for wide arrays... Here's my branch with the changes: https://github.com/ircmaxell/php-src/compare/zval_mark_grey_tail_recursion So I decided to clean up the changes, and start benchmarking it. To my surprise, the results were fairly significant. First, I modified the script from above to instead have a counter to kill the script after 200k function calls (fits into about 120mb of memory). Then, I wrote a benchmark script to execute bench.php and micro_bench.php multiple times (10 to be exact), and then put out the min, max and average for each test under this modified version, and stock master. Then it computed the difference in percentage. I'll link to the gist of the full results, but here are the 3 summaries (Total): Full: Total Core: 2.094 - 2.1429 - 2.278 Modified: 2.089 - 2.1323 - 2.208 Diff: 0.23878% - 0.46532% - 3.07287 Micro: Total Core: 9.309 - 9.4995 - 9.878 Modified: 8.62 - 8.8179 - 9.234 Diff: 7.40144% - 6.90018% - 6.51954 GC Core: 0.5937340259552 - 0.60687265396118 - 0.62824201583862 Modified: 0.43920111656189 - 0.44599950313568 - 0.47265577316284 Diff: 26.02730% - 25.60688% - 24.76534 As you can see, that's not a shabby gain in the micro bench (average of 10 runs is a 6.9% gain), and pretty much a gain all around (on average). Here's the full results (including test scripts, etc): https://gist.github.com/ircmaxell/5782139 Also note that PHP was compiled in both cases with: ./configure --disable-all --enable-cli --disable-cgi I also ran a few tests against the Symfony2 framework (stock testcase), and it pretty much showed a 1 to 3 second improvement over stock (for a 150 second test). So not a huge win overall, but can be significant in cases... What do you think? Is this worth pursuing further? is there a case I'm missing, or some other reason we shouldn't do this? Thanks, Anthony