Hi again,

Hmm, not a single reply about this patch...?  Did anyone try it out? :-)
Think it can be used after 5.2.2?


Matt


----- Original Message -----
From: "Matt Wilmas"
Sent: Thursday, April 12, 2007
Subject: [PHP-DEV] [PATCH] Major optimization for heredocs/interpolated strings

> Hi all,
>
> I think I first realized that PHP's scanner splits non-constant strings into
> many "pieces" after reading Sara's "How long is a piece of string?" blog
> entry[1] last summer.  At the time I didn't know much about the internals
> and didn't know if anything could be done to change it.  Then in the fall I
> finally took a look at the scanner ;-) and thought it would be possible to
> only "split" strings at variables.  Finally a few months ago, I began
> working out the changes -- it was working almost 2 months ago, but then I
> got sidetracked :-/ from doing some more testing and making a few semantic
> token changes till now.
>
> So anyway, now heredocs and interpolated strings should be pretty much just
> like constant strings and concatenation (except for the extra INIT_STRING
> opcode).  They scan/parse/compile faster (with less memory), run faster, and
> there's less to free when destroying opcodes.
>
> With a simple string like "This is $var string" (say $var = 'some'), I
> found the compile/cleanup time to be up to 50% faster, and runtime 55%
> faster!  (Note: To test compile time, I eval()'d about 50 of them in an
> if (0) {...} block.)  The difference will be *much greater* the more
> "pieces" there would've been before (e.g. with longer strings).
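
To make the "pieces" idea concrete, here is a rough Python sketch (not the Flex rules from the patch). It contrasts breaking a string at every word and whitespace run with breaking it only at variables. The function names, the simplified `$name` pattern, and the old-style piece boundaries are all assumptions made for illustration; escapes and the `{$...}` / `${...}` forms are ignored.

```python
import re

# Simplified, hypothetical variable pattern: "$" followed by an identifier.
VAR = re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*")


def split_at_variables(body):
    """Return (kind, text) tokens, breaking the string only at variables."""
    tokens = []
    pos = 0
    for m in VAR.finditer(body):
        if m.start() > pos:
            # Everything between two variables stays one text token.
            tokens.append(("text", body[pos:m.start()]))
        tokens.append(("var", m.group()))
        pos = m.end()
    if pos < len(body):
        tokens.append(("text", body[pos:]))
    return tokens


def split_into_pieces(body):
    """Assumed old-style behaviour: each word, whitespace run, or stray '$'
    becomes its own piece."""
    pattern = r"\$[A-Za-z_][A-Za-z0-9_]*|\s+|[^\s$]+|\$"
    tokens = []
    for m in re.finditer(pattern, body):
        kind = "var" if VAR.fullmatch(m.group()) else "text"
        tokens.append((kind, m.group()))
    return tokens


if __name__ == "__main__":
    sample = "This is $var string"
    print(len(split_into_pieces(sample)), "pieces vs",
          len(split_at_variables(sample)), "tokens")
```

In this sketch a string with no variable in it, such as "$$$", comes back from split_at_variables() as a single text token, which matches the intuition that such a string is really constant.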
>
> The more complex rules increased the size of Flex's tables about 40%.
> However, removing the old heredoc end rule, which used the ^
> beginning-of-line operator, made the YY_RULE_SETUP macro be empty, saving
> some space.  The net result was an 8K/12K larger binary in 5.2/HEAD.  I was
> surprised at the overall performance increase without the ^ rule.  Its
> saving a few operations per match made just about as much difference as
> Flex's -Cfe table compression (was playing with that first :^)) when
> compiling the code from Zend/bench.php (5% I think).
>
> This was with a Windows ZTS build.  Running ApacheBench on a few different
> scripts showed pretty nice overall improvements -- 10-15% was common in my
> quick tests.
>
> BTW, removing that ^ rule lifts the requirement that the character before
> the closing heredoc label "must be a newline as defined by your operating
> system," to quote the manual.
>
> Now some of the other changes:
>
> The ST_SINGLE_QUOTE state was removed from 5.2, like in HEAD.
>
> A string like "$$$" is considered constant now, since that's really what it
> is, right?
>
> Previously, CG(zend_lineno) wasn't incremented if a \n or \r newline (not
> \r\n) followed a backslash in a non-constant string.  Also, \{ returned
> T_STRING instead of T_BAD_CHARACTER like any other invalid escape sequence.
> (Note: Of course these won't usually match now anyway, but will be part of
> a longer string.)
>
> I removed HANDLE_NEWLINES() from the code that scans a string's text,
> instead doing the newline check in the escape-checking loop, to prevent
> scanning twice.  And I removed the additional boundary check in
> HANDLE_NEWLINES() and elsewhere since I didn't see the need -- AFAIK in all
> cases you'll only hit '\0'.
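
As a rough illustration of counting newlines in the same loop that walks the escapes (a Python sketch, not the C code from the patch): the function below makes a single pass, counts \n, \r and \r\n as one line break each, and still counts a newline that directly follows a backslash. The name count_line_breaks is hypothetical, and the escape handling is deliberately minimal.

```python
def count_line_breaks(text):
    """Count line breaks in one pass; "\r\n" counts once.

    A backslash only makes the loop step onto the next character, so a
    newline immediately after a backslash is still counted.
    """
    lines = 0
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "\\" and i + 1 < n:
            # Escape check: move onto the escaped character and fall
            # through to the newline check below.
            i += 1
            ch = text[i]
        if ch == "\r":
            # Treat "\r\n" as a single line break.
            if i + 1 < n and text[i + 1] == "\n":
                i += 1
            lines += 1
        elif ch == "\n":
            lines += 1
        i += 1
    return lines


if __name__ == "__main__":
    print(count_line_breaks("one\ntwo\r\nthree\rfour"))
```

Folding the newline check into the escape loop means each character is inspected once, instead of scanning the text a second time just to update the line counter.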
>
> I removed the one <<EOF>> rule since it was missing some states and it
> wasn't doing anything that the default EOF rule doesn't by calling
> yyterminate().
>
> In zendlex(), the goto target doesn't need to recheck CG(increment_lineno)
> since it hasn't changed, and I simplified the closing tag newline check
> (also looked like it would miss \r ones).
>
> Sorry for the long message!  I'll send another if I think of something I
> forgot to mention.  Here are the patches:
>
> http://realplain.com/php/scanner_optimizations.diff
> http://realplain.com/php/scanner_optimizations_5_2.diff
>
> Appreciate any feedback, or questions about any of it. :-)
>
>
> Thanks,
> Matt
>
> [1] http://blog.libssh2.org/index.php?/archives/28-How-long-is-a-piece-of-string.html

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php