Hello again everyone,

Andrea Faulds wrote:
Hello again,

Andrea Faulds wrote:
Hi,

Matthew Brown wrote:
I imagine such a "list" type would be a subtype of "array" – everywhere
that array was accepted, a list would be also, and it would have the same
copy-on-write behaviour.

IIRC with the modern implementation of arrays in the Zend Engine, you can check that an array has no string keys, has keys in the right order, and possibly even that they are all numbered 0 through (count($arr) - 1) without gaps.

Sorry, what I meant to say was “you can check [quickly in O(1) time] that an array has no string keys [etc]”. This is because of the Zend Engine's “packed arrays”.

Thanks,
Andrea

Since it's pretty simple, I decided to write a proof-of-concept implementation of such a check. For simplicity's sake I haven't implemented a type declaration right now, just an is_list() function, but it could easily be extended to one. The implementation can be found at https://github.com/php/php-src/compare/master...hikari-no-yume:is_list (and here's a direct commit link to the current version as of this writing for the sake of historical record: https://github.com/php/php-src/commit/f6fc94fae8d97743d23d276d66071dac1d240a05)

As I hoped, it can in several common cases give a TRUE result in O(1):

  $ sapi/cli/php -r 'var_dump(is_list([]));'
  No elements
  bool(true)

  $ sapi/cli/php -r 'var_dump(is_list([3, 2, 1]));'
  Packed and without holes!
  bool(true)

  $ sapi/cli/php -r '$x = [3,2,1]; sort($x); var_dump(is_list($x));'
  Packed and without holes!
  bool(true)

 $ sapi/cli/php -r '$x = [1,2,3,4]; unset($x[3]); var_dump($x, is_list($x));'
  Packed and without holes!
  bool(true)

Unfortunately, it turns out that it can't always be fast! :(

The problem is that while an array can have the “packed” and “without holes” flags IFF it has no string keys and all keys are in order without gaps, that does not mean the reverse is true: that if an array lacks those flags, it must have string keys, out-of-order keys or gaps. The result is there are several cases where we must foreach over the array to check if it's a “list”, which unfortunately means as bad as O(n) time complexity depending on the array content.

For example we can't be sure if it has string keys without foreaching over it:

  $ sapi/cli/php -r 'var_dump(is_list(["foo"=>"bar","bar"=>"baz"]));'
  Foreaching
  Hit string key
  bool(false)

I want to say that this used to be possible in the common case without foreaching due to some flag on the HashTable, I think Dmitry added that. But it might have been a proposed patch that didn't get in, or it was reverted.

If this is check will only be used as a type declaration on functions/return-types/properties and not as a general-purpose is_list() function, we might decide that the O(n) worst-case is acceptable so long as it only affects cases where the array is not a valid list, i.e. when a TypeError would be thrown, because in production code, values being type-checked should only rarely have the wrong type, so it should only cause slowness during debugging. It is not like we are trying to make throwing type errors fast… at least so I assume?

Unfortunately, this O(n) worst-case applies even to some valid lists. For example:

$ sapi/cli/php -r '$x = [0=>1,2=>3,1=>2]; unset($x[2]); var_dump($x, is_list($x));'
  Foreaching
  No irregularities encountered, true by slowest path
  array(2) {
    [0]=>
    int(1)
    [1]=>
    int(2)
  }
  bool(true)

$ sapi/cli/php -r '$x = [1,2,3]; $x["foo"] = "bar"; unset($x["foo"]); var_dump(is_list($x));'
  Foreaching
  No irregularities encountered, true by slowest path
  bool(true)

Maybe these kinds of cases are uncommon enough that we could deem the slowness acceptable. We could also use list type-checks as an opportunity to optimise valid lists that aren't already “packed” to be “packed” behind-the-scenes, speeding up subsequent checks. Also, it may be the case that the Zend Engine's hashtable implementation could be changed slightly to allow O(1) checks in more cases.

Anyway, I hope this was interesting and can help inform discussion and perhaps provide inspiration!

Thanks,
Andrea

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to