Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Monday, August 03, 2015
Hi Matt,
On Wed, Jul 22, 2015 at 11:16 PM, Matt Wilmas <php_li...@realplain.com>
wrote:
Hi again Dmitry, all,
Hopefully the final update on this, before all is revealed... :-)
[...]
I tried to rush and finish things up before the weekend *2 weeks ago*, but
it took me too long to get the macros sorted out and working right. :-/
Sorry for the delay, but more and better goodness should now be included.
The extra time allowed me to "relax and take notes" (Notorious B.I.G.),
however. :-D
So yeah, that was all working 10 days ago. Then I realized more function
param data could be packed together, which saved another mov instruction --
so at the call site, it's just mov+lea+call on 64-bit (since execute_data
is already in %rdi). There's nothing else (ignoring checking return
value/return on error, etc.), and each &dest variable is filled in even
though their address isn't taken (thanks to compiler magic). The only
exceptions are FUNC (4 instructions I think) and OBJECT_OF_CLASS and
VARIADIC (1 instruction) types.
Unfortunately (only because I said "same macro syntax," but no big deal),
the syntax had to be changed, from:
ZEND_PARSE_PARAMETERS_START[_EX](...)
Z_PARAM_*(...)
Z_PARAM_*(...)
ZEND_PARSE_PARAMETERS_END[_EX]
to
ZEND_PARSE_PARAMETERS_START[_EX](...)( // Parentheses
Z_PARAM_*(...), // Comma-separated
Z_PARAM_*(...)
) ZEND_PARSE_PARAMETERS_END[_EX]
Errors in nested macros might be very difficult to understand :(
I would prefer not to use nested macros without a significant gain.
Not sure what you mean about errors, unless you're talking about missing a
comma or such...
And those macro calls themselves aren't really nested, just in parentheses
of course.
They are filling [multiple] structs, although that was also the case with
the version using the EXACT current syntax. :-)
Anyway though, it doesn't matter much; not sure what you'll want to do with
all the possibilities I have! And a simple script converts occurrences to
the new syntax for testing (instead of bigger patch).
Significant gain? Nope. :-) I only did that in order to use the "static"
storage specifier in one place, for a pointer to the packed rodata, instead
of filling it at runtime. But I think the file size was the same with or
without static, even though it saved instructions. So not a requirement,
just part of my experiments.
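
For anyone following along, here is a rough sketch of why a comma-separated
list makes "static" possible: the list can serve directly as the initializer
of a static const array, while statement-style macros have to emit stores at
runtime. Every SK_* / sk_* name is made up for illustration and is not the
macro set from the patch; only the type code 2 for double is taken from the
listings further down.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PARAM_LONG   1   /* hypothetical type code */
#define SK_PARAM_DOUBLE 2   /* "double type: 2", as in the asm comments */

/* Statement style (like the current syntax): each macro is a runtime store. */
#define SK_START_STMT(buf)  do { uint8_t *sk_p = (buf);
#define SK_PARAM_STMT(t)        *sk_p++ = (t);
#define SK_END_STMT         } while (0)

/* List style (parenthesized, comma-separated): the arguments form an
 * initializer list, so the data can be "static const" and live in rodata. */
#define SK_STATIC_SPEC(name, ...) \
    static const uint8_t name[] = { __VA_ARGS__ }

/* Fills a caller-supplied buffer at runtime (the movb instructions). */
static void sk_spec_runtime(uint8_t *buf)
{
    SK_START_STMT(buf)
        SK_PARAM_STMT(SK_PARAM_DOUBLE)
        SK_PARAM_STMT(SK_PARAM_DOUBLE)
    SK_END_STMT;
}

/* Returns a pointer to data that was laid out at compile time. */
static const uint8_t *sk_spec_static(void)
{
    SK_STATIC_SPEC(spec, SK_PARAM_DOUBLE, SK_PARAM_DOUBLE);
    return spec;
}
```

Both forms describe the same two parameters; the difference is only where the
bytes come from at the call site.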
Like I said, the BIG neat thing is getting the same optimization (all except
the "static" part) for the *traditional* ZPP. I hadn't touched it since
last message until this week (doing other stuff and too sick ~4 days to do
anything :-/) and wanted to check closer to final code before replying --
but still looks good with GCC so far!
So depending, there's maybe less interest in my smaller FAST_ZPP
implementation... *shrug*
Overall, the *code* size is reduced (vs traditional ZPP), but the file
size isn't (static stuff in rodata or whatever), which was a bit
surprising, although most of these PHP functions don't have many
parameters...
I can only guess where this static data came from, because I haven't seen
the code yet :)
Just "static const" stuff. :-) After the very first attempt, I've wanted to
pack stuff together. Function min/max args and any flags (QUIET/THROW, or
the new METHOD) are in a 4 byte int. (GCC doesn't want to pack them
together in the latest case, but easily fixed.) Then a byte for each
parameter. So, I tried "static const" to eliminate the movb instructions,
that's all.
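
A minimal sketch of that packing, assuming the byte layout shown in the
listings below (min_args in byte 0, max_args in bytes 1-2, flags in byte 3 on
a little-endian machine, then one type byte per parameter). The names and the
flag values are hypothetical, not the ones in the actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; the real QUIET/THROW/METHOD values may differ. */
enum { SK_FLAG_QUIET = 1, SK_FLAG_THROW = 2, SK_FLAG_METHOD = 4 };

/* Only "double = 2" is taken from the listings. */
enum { SK_TYPE_DOUBLE = 2 };

/* min_args: bits 0-7, max_args: bits 8-23, flags: bits 24-31. */
#define SK_HEADER(min, max, flags) \
    ((uint32_t)(min) | ((uint32_t)(max) << 8) | ((uint32_t)(flags) << 24))

static inline unsigned sk_min_args(uint32_t h) { return h & 0xffu; }
static inline unsigned sk_max_args(uint32_t h) { return (h >> 8) & 0xffffu; }
static inline unsigned sk_flags(uint32_t h)    { return h >> 24; }

/* Header plus one byte per parameter, e.g. for atan2(double, double). */
typedef struct {
    uint32_t header;
    uint8_t  types[2];
} sk_atan2_info;

/* "static const" puts the whole description in read-only data, so nothing
 * has to be stored at the call site. */
static const sk_atan2_info sk_atan2 = {
    SK_HEADER(2, 2, 0),
    { SK_TYPE_DOUBLE, SK_TYPE_DOUBLE }
};
```

With something like this, the caller only needs to pass the address of
sk_atan2, which is the single "mov $...,%esi" in the static const listing.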
Just to give an idea, here's the different instructions for atan2() with GCC
4.8 -O2 (after push %rbx, comments mostly for others):
== Traditional ZPP ==
xor %eax,%eax # ??? align padding?
mov %rsi,%rbx
mov $0x61c4f3,%esi # format string ptr
sub $0x10,%rsp
mov 0x2c(%rdi),%edi # ZEND_NUM_ARGS()
lea 0x8(%rsp),%rcx # &num2
mov %rsp,%rdx # &num1
callq 595670 <zend_parse_parameters>
cmp $0xffffffff,%eax
je 4f7f4f <zif_atan2+0x3f>
movsd 0x8(%rsp),%xmm1
movsd (%rsp),%xmm0
callq 419190 <atan2@plt>
== My macros, "static const" version ==
mov %rsi,%rbx
mov $0x7709f8,%esi # packed static info ptr; execute_data in %rdi
sub $0x20,%rsp # 16 bytes more; each parameter needs 16 bytes stack
mov %rsp,%rdx # &num1 AND &num2, effectively; usually "lea ?,%rdx"
callq 5935d0 <zend_fast_parse_parameters>
test %eax,%eax # shorter than cmp comparing with SUCCESS vs FAILURE
jne 4f6f84 <zif_atan2+0x34>
movsd 0x10(%rsp),%xmm1
movsd (%rsp),%xmm0
callq 419330 <atan2@plt>
== Traditional ZPP, **optimized at compile time** ==
mov $0x2,%eax # ??? max_args, for below
mov %rsi,%rbx
sub $0x30,%rsp
lea 0x10(%rsp),%rdx # &num1, &num2, ..., effectively
mov %rsp,%rsi # packed info, filled by the following
movb $0x2,(%rsp) # min_args
mov %ax,0x1(%rsp) # max_args
movb $0x0,0x3(%rsp) # flags (none)
movb $0x2,0x4(%rsp) # 'd' double type: 2
movb $0x2,0x5(%rsp) # 'd' double type: 2
callq 5935f0 <zend_fast_parse_parameters>
test %eax,%eax
jne 4f6fa8 <zif_atan2+0x58>
movsd 0x20(%rsp),%xmm1
movsd 0x10(%rsp),%xmm0
callq 419330 <atan2@plt>
That (optimizing traditional string ZPP) will be the *equivalent* of 64KB+
of C code (repetition), all reduced to nothing. :-) And more of that should
(will) be packed together. Hopefully this continues, and with other
compilers, on non-Windows anyway.
Don't know about Windows now... Visual Studio 2008 and 2012 (not much
difference) are NOT optimizing away the code (other times it was GCC with
issues). :-/ Not sure why. Of course they don't support the necessary
compound literals anyway, but I was just testing a manual case... I'll have
to try and check 2015 version soon.
Regardless, there will be a fallback function to be called with optimized
runtime string parsing, to be used if compilers don't create optimized code.
I'll be checking more compilers, of course...
Sorry for the delay. Thought I'd have patch for you when you got back!
It's really about "finished" now, but not sure how many more days of final
tweaking and testing till ready for patch. :-)
Thanks. Dmitry.
- Matt
The biggest size savings actually came from the simple initial
optimization of zend_parse_params_none(). Down to almost nothing, much
faster, and saved 4KB on my --disable-all builds.
NEW GOODNESS -- What would of course be nice to have is a big optimization
of the traditional zend_parse[_method]_parameters[_ex|_throw] to avoid
changing them all. And it seems some people, like Derick, prefer it.
Of course the obvious way I first had in mind weeks ago was to simply
parse its format string faster (once-ish) at runtime, and then feed it to
this new FAST_parse function. Should give at least 2x speedup I figured.
But with this latest implementation, where the function should probably now
be called parse_parameters_ARRAY instead of fast_parse, it would need a
second pass after parsing the string. Not a huge deal, but...
What would be *really nice* is to have the compiler parse the format
string, at compile time, and use the new system directly. And... that
should be possible!! 8-)
Last week I figured GCC's "statement expressions" [1] could be used, which
most compilers seem to support, except MSVC. But just over the weekend I
realized an inline function could be used with a compound literal (for the
varargs), which is also supported in the latest MSVC versions. Awesome!
And again, fear not, ALL the code can be completely removed by the
compiler, leaving only movb instructions instead of lea+mov/push for the
traditional ZPP function call. So, better than my initial
implementation(s), and nearly the same as my final macro version! I was
just testing prototypes of portions with GCC yesterday, which does fine
after adjusting to not generate *horribly stupid* code.
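
To illustrate the general idea (not the actual implementation), here is a
small sketch: an inline function walks a literal format string, and a macro
collects the destination pointers in a C99 compound literal instead of real
varargs. With a constant format string and optimization on, a compiler may be
able to fold the parsing loop away, though that depends on the compiler. All
sk_* names, the handling of only 'l', 'd', 's' and '|', and the stand-in
parser that just copies doubles are assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Only "double = 2" comes from the listings; the rest is hypothetical. */
enum { SK_T_LONG = 1, SK_T_DOUBLE = 2, SK_T_STRING = 3 };

typedef struct { uint8_t min_args, max_args, types[8]; } sk_spec;

/* Turn a format string such as "d|d" into min/max counts and type bytes. */
static inline int sk_parse_spec(const char *fmt, sk_spec *out)
{
    int n = 0, min = -1;
    for (; *fmt; fmt++) {
        uint8_t t;
        switch (*fmt) {
            case '|': min = n; continue;   /* the rest is optional */
            case 'l': t = SK_T_LONG;   break;
            case 'd': t = SK_T_DOUBLE; break;
            case 's': t = SK_T_STRING; break;
            default:  return -1;           /* unsupported in this sketch */
        }
        if (n == 8) return -1;
        out->types[n++] = t;
    }
    out->min_args = (uint8_t)(min < 0 ? n : min);
    out->max_args = (uint8_t)n;
    return 0;
}

/* Stand-in for the array-based parser: copies doubles into dests[]. */
static int sk_parse_array(const sk_spec *spec, int argc,
                          const double *args, void **dests)
{
    if (argc < spec->min_args || argc > spec->max_args) return -1;
    for (int i = 0; i < argc; i++) {
        if (spec->types[i] != SK_T_DOUBLE) return -1;
        *(double *)dests[i] = args[i];
    }
    return 0;
}

static inline int sk_parse_fmt(int argc, const double *args,
                               const char *fmt, void **dests)
{
    sk_spec spec;
    if (sk_parse_spec(fmt, &spec) != 0) return -1;
    return sk_parse_array(&spec, argc, args, dests);
}

/* The compound literal replaces the varargs list of &dest pointers. */
#define SK_PARSE(argc, args, fmt, ...) \
    sk_parse_fmt((argc), (args), (fmt), (void *[]){ __VA_ARGS__ })
```

A call then keeps the traditional shape, e.g.
SK_PARSE(2, in, "dd", &num1, &num2), while the work is done by the array
parser.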
Now to implement it into PHP ASAP! Then I'll save a few more
branches/instructions in the parse function (specialized for common cases;
some useless GCC instructions), comment and clean up my experimental mess,
and write up some explanation of the changes before sending patch. Oh, and
I should verify what Clang does with the code as well...
Stay tuned!
[1] https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html
--
PHP Internals - PHP Runtime Development Mailing List