Hi Benjamin,

I think this is great idea!

Let me check that can we get from this, and if we may add this into PHP-7.1 (it 
may be to late).


Thanks. Dmitry.

________________________________
From: Benjamin Coutu <ben.co...@zeyos.com>
Sent: Wednesday, October 19, 2016 1:45:00 PM
To: Xinchen Hui; Dmitry Stogov; Nikita Popov
Cc: PHP Internals
Subject: [PHP-DEV] Exploit fully packed array/hash property

Hello everyone,

I've identified a few more array/hash use cases where it might make sense to 
introduce special short circuit logic for packed arrays.

Specifically, there is an additional property of certain packed arrays (apart 
from being packed obviously) that we can utilize: A packed array with nNumUsed 
equal to nNumOfElements must be consecutively indexed from zero onward without 
any gaps (no IS_UNDEF). Let's call this special pure form of a packed array a 
"fully packed array".

Earlier I have posted on how this convenient property could speed things up for 
array_slice, even if preserved_keys=true (for the offset=0 case), see 
https://marc.info/?l=php-internals&m=147048569215717&w=2

I therefore propose to introduce the following two new macros in 
/Zend/zend_types.h in order to make it easier for developers to introduce 
special packed (or fully packed) array logic:

#define HT_IS_PACKED(ht) ((ht)->u.flags & HASH_FLAG_PACKED)
#define HT_IS_FULLY_PACKED(ht) (HT_IS_PACKED(HT) && (ht)->nNumUsed == 
(ht)->nNumOfElements)

With this we can easily speed things up (and more importantly preserve the 
packed array characteristics) for the common case of array_slice($packed_array, 
$offset=0, $length, $preserved_keys=true) by using the follwing snippet for 
https://github.com/php/php-src/blob/master/ext/standard/array.c#L2906 :

if (preserve_keys ? (offset == 0 && HT_IS_FULLY_PACKED(Z_ARRVAL_P(input))) : 
HT_IS_PACKED(Z_ARRVAL_P(input)))
  ... ZEND_HASH_FILL_PACKED ...

Another example where this could come in handy involves the encoding of JSON. 
For every array that it encodes, json_encode must first detemine wheter the 
array contains string keys or is not consecutively indexed, so that it can 
decide wheter to use a JSON object or a plain JSON array. That is accomplished 
through php_json_determine_array_type in /ext/json/json_encoder.c. That code 
currently has to iterate through the array until it finds a string key or a 
non-consecutive index (which in the worst case would mean iterating through the 
entire array).

Now, for fully packed arrays we can short circuit things and jump directly to 
the conclusion of it being a real plain old array, if it is packed and we can 
prove it has no gaps. That can of course easily be done via the new 
HT_IS_FULLY_PACKED macro. We can improve this code for a large amount of arrays 
with the following simple snippet for 
https://github.com/php/php-src/blob/master/ext/json/json_encoder.c#L38 :

if (HT_IS_FULLY_PACKED(myht))
  return PHP_JSON_OUTPUT_ARRAY;

I'll be happy to make an effort to screen the entire code base in search for 
more cases where this could be useful once this is picked up by a lead core 
developer (maybe Xinchen?) who is willing to commit something on the lines of 
the above.

Any thoughts?

Benjamin

--

Bejamin Coutu
ben.co...@zeyos.com

ZeyOS, Inc.
http://www.zeyos.com

Reply via email to