Hi everyone,
A longstanding, documented issue in PHP is that casts between objects
and arrays leave numeric keys inaccessible. Specifically, numeric string
properties on objects become inaccessible string keys in arrays, and
integer keys in arrays become inaccessible integer properties on objects.
This is due to the Zend Engine's internal HashTable structure supporting
both integer and string keys, but objects and arrays using it
differently. Objects look for string keys, never integer keys, whereas
arrays look for string keys only where a string is non-numeric, and
integer keys in all other cases. This means that a naïve copy of a
HashTable from an array to an object or vice-versa can result in having
keys that cannot be accessed (or indeed deleted) from userland.
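To make the mismatch concrete, here is a rough toy model in Python. It is not engine code: a dict stands in for the HashTable, and the names is_integer_string, array_get and object_get are made up for illustration. The integer-string rule is a simplification that assumes a 64-bit integer range.

```python
# Toy model only: a Python dict stands in for the engine's HashTable,
# which can hold both int and str keys side by side.

INT_MIN, INT_MAX = -2**63, 2**63 - 1  # assumption: 64-bit integer range


def is_integer_string(key):
    """True if a string is the canonical decimal form of an in-range integer."""
    try:
        value = int(key)
    except ValueError:
        return False
    return str(value) == key and INT_MIN <= value <= INT_MAX


def array_get(table, key):
    """Array-style lookup: integer-like strings are looked up as ints."""
    if isinstance(key, str) and is_integer_string(key):
        key = int(key)
    return table.get(key)


def object_get(table, name):
    """Object-style lookup: property names are always strings."""
    return table.get(str(name))


# An object's table with a property named "123", naively reused as an array.
props = {"123": "a"}
print(array_get(props, "123"))  # None: the array looks for int 123
print(array_get(props, 123))    # None: same reason

# An array's table with integer key 0, naively reused as an object.
elems = {0: "b"}
print(object_get(elems, 0))     # None: the object looks for str "0"
```

In this model the values are still in the table, but neither lookup style can reach them.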
Fixing this issue could be done in one of two ways. The first would be
to change how objects (or arrays) perform HashTable lookups internally.
This would require extensive code changes, and could hurt performance
for object property lookups (performing numeric string coercions is
unhelpful given object property names are rarely numeric strings).
Because of this, such an approach is unlikely to be pursued.
The other approach is to make the object and array casting operations
less naïve, that is, making them convert numeric string keys to integers
when converting to an array, and convert integer keys to numeric strings
when converting to an object. This is simple to implement as only the
code for casting needs changing, and the conversion is not complex.
However, this has the danger of significantly reducing the performance
of object to array and array to object casts, and this performance
concern seems to have been one reason why fixing this issue has not
happened so far.
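As a rough sketch of what such a key-converting cast does, here is a toy Python model. It is not the patch's C code. The function names are hypothetical, a dict stands in for the HashTable, and the integer-string rule is a simplification that assumes a 64-bit integer range.

```python
# Toy model only: dicts stand in for HashTables with mixed int/str keys.

INT_MIN, INT_MAX = -2**63, 2**63 - 1  # assumption: 64-bit integer range


def is_integer_string(key):
    """True if a string is the canonical decimal form of an in-range integer."""
    try:
        value = int(key)
    except ValueError:
        return False
    return str(value) == key and INT_MIN <= value <= INT_MAX


def to_array_table(props):
    """Object -> array: integer-like string keys become int keys."""
    return {
        (int(k) if isinstance(k, str) and is_integer_string(k) else k): v
        for k, v in props.items()
    }


def to_object_table(elems):
    """Array -> object: every key becomes a string property name."""
    return {str(k): v for k, v in elems.items()}


print(to_array_table({"123": "a", "name": "b"}))  # {123: 'a', 'name': 'b'}
print(to_object_table({0: "x", "k": "y"}))        # {'0': 'x', 'k': 'y'}
```

Both directions visit every key and build a new table, which is where the performance concern comes from.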
This needn't be the case, though. I have written a patch that would fix
this issue, and have tried to design it to minimise the performance
impact in the common case. You can find the pull request here, along
with a link to the simple benchmark I have done:
https://github.com/php/php-src/pull/2142
Specifically, it limits the performance impact by first checking whether
an object or array contains any numeric string properties or integer
keys, respectively. If it does not, it simply uses the normal, naïve
copy operation (or performs no copy at all, if possible), which is very
fast. On the other hand, if the object or array does contain numeric
string properties or integer keys, it will take the slow path and
perform the required conversions. This way, the common case (object to
array and array to object conversions with only non-numeric
properties/keys) remains fast, although slightly (between 7% and 21%,
depending on the specific operation) slower than before.
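The check-then-convert idea can be sketched with a toy Python model. It is illustrative only and not the code in the pull request: cast_to_array and cast_to_object are hypothetical names, a dict stands in for the HashTable, and returning the same dict models the "no copy at all" case.

```python
# Toy model only: dicts stand in for HashTables with mixed int/str keys.

INT_MIN, INT_MAX = -2**63, 2**63 - 1  # assumption: 64-bit integer range


def is_integer_string(key):
    """True if a string is the canonical decimal form of an in-range integer."""
    try:
        value = int(key)
    except ValueError:
        return False
    return str(value) == key and INT_MIN <= value <= INT_MAX


def cast_to_array(props):
    """Object -> array, converting keys only when some key needs it."""
    if not any(isinstance(k, str) and is_integer_string(k) for k in props):
        return props  # fast path: no integer-like string keys, reuse the table
    # Slow path: rebuild the table with integer-like strings as int keys.
    return {
        (int(k) if isinstance(k, str) and is_integer_string(k) else k): v
        for k, v in props.items()
    }


def cast_to_object(elems):
    """Array -> object, converting keys only when some key needs it."""
    if not any(isinstance(k, int) for k in elems):
        return elems  # fast path: no integer keys, reuse the table
    # Slow path: rebuild the table with every key as a string.
    return {str(k): v for k, v in elems.items()}


common = {"name": "a", "size": "b"}
print(cast_to_array(common) is common)  # True: fast path, nothing rebuilt
print(cast_to_array({"123": "a"}))      # {123: 'a'}: slow path
print(cast_to_object({0: "x"}))         # {'0': 'x'}: slow path
```

Even on the fast path the keys are scanned once, which is consistent with the common case being slightly slower than a purely naive copy.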
This pull request also affects get_object_vars(), as it has the same
issue with inaccessible keys.
I'd appreciate any feedback on this. Particularly, I wonder if it is too
late to fix this in PHP 7.1, and whether it should go into PHP 7.2. In
the former case, it is worth considering that this changes documented,
if perhaps unintended, behaviour and is therefore arguably a
backwards-compatibility break.
Also, does this need an RFC? It is a bug fix.
Thanks!
--
Andrea Faulds
https://ajf.me/
--
PHP Internals - PHP Runtime Development Mailing List