Hi Moriyoshi,

Moriyoshi Koizumi wrote:
Hi,

Wouldn't it suffice to add a field for the hash value and a flag that
indicates its validity to zval instead of appending zend_literal
everywhere?

We used the approach you suggest in the early stages of development, but then realized that passing zend_literal* adds more power at the same cost. Think about class and method names, which need to be converted to lower case. For those in particular, we can pass the original name together with a zend_literal* that represents the lower-cased name.
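
To illustrate the point about lower-cased names, here is a minimal C sketch. The name_literal struct and the functions lower_and_hash, name_literal_init and resolve_key are hypothetical names made up for this example; they do not reflect the actual zend_literal layout in the patch, and the hash is a DJB2-style function picked only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NAME_MAX 64

/* Hypothetical literal for a class or method name: the lower-cased form
 * and its hash, both computed once. Not the real zend_literal layout. */
typedef struct {
    char          lc_name[SKETCH_NAME_MAX];
    unsigned long hash;
} name_literal;

/* Lower-case src into dst (at most SKETCH_NAME_MAX - 1 bytes) and return
 * a DJB2-style hash of the lower-cased bytes. */
static unsigned long lower_and_hash(char *dst, const char *src)
{
    unsigned long h = 5381UL;
    size_t i = 0;
    for (; src[i] != '\0' && i < SKETCH_NAME_MAX - 1; i++) {
        dst[i] = (char)tolower((unsigned char)src[i]);
        h = h * 33UL + (unsigned char)dst[i];
    }
    dst[i] = '\0';
    return h;
}

/* "Compile time": lower-case and hash the name exactly once. */
static void name_literal_init(name_literal *lit, const char *original)
{
    assert(lit != NULL && original != NULL);
    assert(strlen(original) < SKETCH_NAME_MAX);
    lit->hash = lower_and_hash(lit->lc_name, original);
}

/* "Run time": the caller passes the original spelling and, when it has
 * one, the literal. With a literal, the cached key and hash are returned
 * as-is; without one, the key is built in scratch on the spot. */
static const char *resolve_key(const char *original, const name_literal *lit,
                               char scratch[SKETCH_NAME_MAX],
                               unsigned long *hash_out)
{
    if (lit != NULL) {
        *hash_out = lit->hash;
        return lit->lc_name;
    }
    *hash_out = lower_and_hash(scratch, original);
    return scratch;
}
```

With the literal in hand, the lookup path does no lower-casing and no hashing, while the original spelling is still available to the callee, for example for error messages.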

Thanks. Dmitry.

Moriyoshi

On Wed, Mar 24, 2010 at 11:12 PM, Zeev Suraski <z...@zend.com> wrote:
Hi,

Over the last few weeks we've been working on several ideas we had for
performance enhancements. We've managed to make some good progress.  Our
initial tests show roughly 10% speed improvement on real world apps.  On
pure OO code we're seeing as much as 25% improvement (!)

While this still is a work in progress (and not production quality code yet)
we want to get feedback sooner rather than later. The diff (available at
http://bit.ly/aDPTmv) applies cleanly to trunk.  We'd be happy for people to
try it out and send comments.

What does it contain?

1) Constant operands have been moved from being embedded within the opcodes
into a separate literal table. In addition to the zval, the table contains
pre-calculated hash values for string literals. As a result, PHP uses less
memory and doesn't have to recalculate hash values for constants at
run-time.

2) Lazy HashTable buckets allocation – we now only allocate the buckets
array when we actually insert data into the hash for the first time.  This
saves both memory and time as many hash tables do not have any data in them.

3) Interned strings (see
http://en.wikipedia.org/wiki/String_interning).
Most strings known at compile-time are allocated in a single copy with some
additional information (pre-calculated hash value, etc.).  We try to make
most incarnations of a given string point to that same single version,
allowing us to save memory, but more importantly - run comparisons by
comparing pointers instead of comparing strings and avoid redundant hash
value calculations.

A couple of notes:
a.  Not all strings are interned, which means that if a pointer
comparison fails, we still fall back to a string comparison; but if it
succeeds, that is good enough.
b.  We'd need to add support for this in the bytecode caches. We'd be happy
to work with the various bytecode cache teams to guide how to implement
support so that you do not have to intern on each request.
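
The pointer-first comparison from note (a) can be sketched in C roughly as follows. This is only an illustration under simplifying assumptions: sketch_pool, sketch_intern and sketch_str_equal are hypothetical names, the pool is a small flat array searched linearly, and it stores the caller's pointers (which must therefore outlive the pool). None of this is the patch's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_POOL_MAX 64

/* Hypothetical intern pool: a flat array of canonical string pointers. */
static const char *sketch_pool[SKETCH_POOL_MAX];
static size_t sketch_pool_len = 0;

/* Return the canonical pointer for s. If an equal string is already in
 * the pool, that earlier pointer is returned; otherwise s is registered.
 * When the pool is full, s is returned unchanged (i.e. not interned). */
static const char *sketch_intern(const char *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < sketch_pool_len; i++) {
        if (strcmp(sketch_pool[i], s) == 0) {
            return sketch_pool[i];
        }
    }
    if (sketch_pool_len < SKETCH_POOL_MAX) {
        sketch_pool[sketch_pool_len++] = s;
    }
    return s;
}

/* Equality check: identical pointers are enough to prove equality.
 * Different pointers prove nothing, because not every string is
 * interned, so we fall back to a byte comparison. */
static int sketch_str_equal(const char *a, const char *b)
{
    if (a == b) {
        return 1;
    }
    return strcmp(a, b) == 0;
}
```

Two interned copies of "foo" end up sharing one pointer and take the fast path; a copy that was never interned still compares equal, just through the slower strcmp path.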

To get a better feel for what interning actually does, consider the
following examples:

// Lookup for $arr will not calculate a hash value, and will only require a
// pointer comparison in most cases
// Lookup for "foo" in $arr will not calculate a hash value, and will only
// require a pointer comparison
// The string "foo" will not have to be allocated as a key in the Bucket
// "blah" when assigned doesn't have to be duplicated
$arr["foo"] = "blah";

$a = "b";
if ($a == "b") { // pointer comparison only
 ...
}
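
Going back to item 2 (lazy bucket allocation), the idea can be sketched in C as follows. This is a deliberately simplified stand-in: lazy_table is a hypothetical append-only array of ints rather than a real HashTable, and all function names are made up for the example. It only shows the allocation being deferred until the first insert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical simplified table; not the real HashTable structure. */
typedef struct {
    int    *buckets;   /* stays NULL until the first insert */
    size_t  capacity;  /* size to allocate once it is needed */
    size_t  used;
} lazy_table;

/* Initialization records the capacity but allocates nothing. */
static void lazy_table_init(lazy_table *t, size_t capacity)
{
    assert(t != NULL && capacity > 0);
    t->buckets  = NULL;
    t->capacity = capacity;
    t->used     = 0;
}

/* The first insert pays for the allocation. Returns 0 on success,
 * -1 if allocation fails or the table is full. */
static int lazy_table_push(lazy_table *t, int value)
{
    if (t->buckets == NULL) {
        t->buckets = calloc(t->capacity, sizeof *t->buckets);
        if (t->buckets == NULL) {
            return -1;
        }
    }
    if (t->used == t->capacity) {
        return -1;
    }
    t->buckets[t->used++] = value;
    return 0;
}

/* free(NULL) is a no-op, so a table that never received data is
 * destroyed without ever having touched the allocator. */
static void lazy_table_destroy(lazy_table *t)
{
    free(t->buckets);
    t->buckets = NULL;
    t->used    = 0;
}
```

A table that is initialized and destroyed without any insert never allocates, which is where the memory and time savings for empty hash tables come from.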

Comments welcome!

Zeev

Patch available at: http://bit.ly/aDPTmv


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



