Hi,

In most programming languages, method calls are expensive, and especially so in PHP, which spends no effort at compile time resolving method calls to their targets (and cannot, because classes may be loaded lazily using include with variables). I recently did some performance profiling to find out exactly how slow method calls are compared to other operations such as incrementing an int (the factor is around seven), and how they compare to compiled languages (the factor lies between 400 and 1400).

Here goes the test:

 $instance->method();

...in different variants, using public, private and protected methods (the latter two being the slowest). On my machine I get around 700'000 method calls per second, while C#, for example, scores 250'000'000. Your mileage will vary.

Since the difference between these numbers is quite discouraging, I started digging a bit deeper into how method calls are handled by the Zend Engine. Taking the example from above again, here's what happens (in zend_vm_def.h and zend_object_handlers.c):

1) Finding the execution target
 a. $instance is a variable, so we have a zval*
 b. if Z_TYPE_P() of this zval is IS_OBJECT, OK.
 c. Z_OBJCE_P() yields the zend_class_entry* ce
 d. method is a zval*, its type being IS_STRING
 e. Given ce's function_table, we can look up the zend_function*
    corresponding to the method entry by its (previously
    lowercased!) name
 f. If we can't find it and the ce has a __call, go for that,
    else zend_error()

2) Verifying it
  a. If the modifiers are PUBLIC, OK.
  b. If they're private, verify EG(scope) == ce. If they match,
     OK, if not, try for ce->__call, if that doesn't exist, error.
  c. If they're protected, verify instanceof_function(ce, EG(scope)).
     If that check fails, try ce->__call, if that doesn't exist,
     error. If it exists, OK.

3) Insurance
  a. Finally test if the zend_function* found is neither abstract
     nor deprecated.
  b. Test non-static methods aren't called statically, else issue
     a warning (or error, depending on the situation).

4) Execute
  a. Take EX(function_state).function->op_array and zend_execute()
     it.
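
Steps 1e through 2c can be sketched in plain C roughly as follows. This is only a toy model under my own assumptions: the toy_* types and functions are hypothetical stand-ins, not the Zend Engine API, the function table is a linear array rather than a hash, and the protected check simply follows the description in 2c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for zend_class_entry and zend_function. Hypothetical
 * types for illustration only, not the Zend Engine API. */
enum toy_visibility { TOY_PUBLIC, TOY_PROTECTED, TOY_PRIVATE };

typedef struct toy_method {
    const char *name;                  /* already lowercased, as in step 1e */
    enum toy_visibility visibility;
} toy_method;

typedef struct toy_class {
    const char *name;
    const struct toy_class *parent;    /* NULL for a root class */
    const toy_method *methods;         /* stands in for function_table */
    size_t method_count;
    const toy_method *call_fallback;   /* stands in for __call, may be NULL */
} toy_class;

/* Step 1e: look the method up by name (linear scan instead of a hash). */
const toy_method *toy_find(const toy_class *ce, const char *name)
{
    size_t i;
    for (i = 0; i < ce->method_count; i++) {
        if (strcmp(ce->methods[i].name, name) == 0) {
            return &ce->methods[i];
        }
    }
    return NULL;
}

/* Returns 1 if ce is `ancestor` or inherits from it, 0 otherwise. */
int toy_instanceof(const toy_class *ce, const toy_class *ancestor)
{
    for (; ce != NULL; ce = ce->parent) {
        if (ce == ancestor) return 1;
    }
    return 0;
}

/* Steps 1e to 2c. A NULL result means "raise an error". */
const toy_method *toy_resolve(const toy_class *ce, const char *name,
                              const toy_class *scope)
{
    const toy_method *m = toy_find(ce, name);
    if (m == NULL) return ce->call_fallback;                  /* 1f */
    switch (m->visibility) {
    case TOY_PUBLIC:                                          /* 2a */
        return m;
    case TOY_PRIVATE:                                         /* 2b */
        return scope == ce ? m : ce->call_fallback;
    case TOY_PROTECTED:                                       /* 2c */
        return (scope != NULL && toy_instanceof(ce, scope))
            ? m : ce->call_fallback;
    }
    return NULL;
}
```

Every call pays for the name lookup plus the visibility branch, even though the outcome depends only on the method name, the class, and the calling scope.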

You can clearly see that the checks in #1 and #2 (most of which happen in zend_std_get_method()) are quite extensive. The idea I developed was to cache this information, and I came up with the following:

* At [1d], calculate a hash key for the following:
 - method->name
 - ce->name
 - EG(scope) ? EG(scope)->name : ""
 These are the only variables used for verifying scope and
 modifiers, and the verification is always going to yield the
 same result as long as they stay the same.

* Look this up in a hashtable (in generic-speak:
 HashTable<ulong, zend_function*>). If found, return that,
 continue with [1e] otherwise.

* After [2c], store the found zend_function* to the hash.
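
To illustrate the scheme above, here is a minimal standalone sketch in C. It is not the code from the patch: the toy_* names, the times-33 string hash, the separator byte and the fixed-size direct-mapped table are all assumptions made for the sake of a short example.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SLOTS 256            /* arbitrary size for this sketch */

typedef struct toy_cache_entry {
    unsigned long key;
    const void *fn;                    /* would be a zend_function* */
    int used;
} toy_cache_entry;

typedef struct toy_cache {
    toy_cache_entry slots[TOY_CACHE_SLOTS];
} toy_cache;

/* Fold one string into a running times-33 hash, then fold in a
 * separator byte to mark where the string ended. */
unsigned long toy_hash_str(unsigned long h, const char *s)
{
    while (*s != '\0') {
        h = h * 33 + (unsigned char)*s++;
    }
    return h * 33 + 0x1f;
}

/* Numeric key over method name, class name and scope name ("" if the
 * call comes from outside any class, i.e. scope is NULL). */
unsigned long toy_cache_key(const char *method, const char *class_name,
                            const char *scope_name)
{
    unsigned long h = 5381;
    h = toy_hash_str(h, method);
    h = toy_hash_str(h, class_name);
    h = toy_hash_str(h, scope_name != NULL ? scope_name : "");
    return h;
}

/* Returns the cached pointer, or NULL on a miss. */
const void *toy_cache_get(const toy_cache *c, unsigned long key)
{
    const toy_cache_entry *e = &c->slots[key % TOY_CACHE_SLOTS];
    return (e->used && e->key == key) ? e->fn : NULL;
}

/* Stores the pointer, overwriting whatever occupied the slot before. */
void toy_cache_put(toy_cache *c, unsigned long key, const void *fn)
{
    toy_cache_entry *e = &c->slots[key % TOY_CACHE_SLOTS];
    e->key = key;
    e->fn = fn;
    e->used = 1;
}
```

On a miss the caller would run the full lookup and verification and then call toy_cache_put() with the result. One thing to keep in mind with a purely numeric key: two different (method, class, scope) triples that happen to hash to the same value would be indistinguishable in the cache, so a production version would probably want to guard against that.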

I was curious how this would affect overall performance, both in synthetic and in real-world situations. The first tests I ran were something along the lines of:

 for ($i= 0; $i < $times; $i++) {
   $instance->method();
 }

...with and without the patch. The PHP I built with the patch was faster by a factor of 1.7 to 1.8! The real-world test was running the test suite of an object-oriented PHP framework, which took 1.55 seconds before and 0.91 seconds after (a factor of about 1.7). I would call this good, almost doubling the speed. This is nowhere near the factors I mentioned before, but I think it has potential. Caching comes at a cost, of course, but by using a numeric key instead of a string I could reduce the overhead to a minimum: the real-world application consumes about 20 KB more memory, which I'd call negligible.

Last but not least, I verified that I hadn't utterly broken the way PHP works by running the tests from Zend/tests, and found no tests failing with the patch that weren't already failing without it (some of those failures expected, some not).

The simple idea is a ~50 line patch intended for the PHP_5_3 branch and available at the following location:

 http://sitten-polizei.de/php/method-call-cache.diff

It serves its purpose quite well in the CLI SAPI but would definitely need fixing up before it could go into production (parts of it belong in zend_hash.c, and the cache variable needs to be an EG() instead of a static).

I'm interested in your opinions and if you think its addition would be worth a try.

- Timm



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
