On 30 Jul 2014, at 07:50, Tjerk Meesters <tjerk.meest...@gmail.com> wrote:
>> That would make sense, but doesn't solve all edge cases as your maximum array
>> index is still more than 2 times the largest positive integer on 32-bit.
>
> Is that by design, a bug or something else entirely? Could you explain this
> edge case with some code?

On a 32-bit platform, the maximum signed long is 0x7FFFFFFF, but the maximum
unsigned long is 0xFFFFFFFF, slightly more than twice as big.

For example, this does what you’d expect on my machine (OS X 64-bit Intel
Core i5):

andreas-air:~ ajf$ php -r '$x = [0xFFFFFFFF => 1]; $x[] = 2; var_dump($x);'
array(2) {
  [4294967295]=>
  int(1)
  [4294967296]=>
  int(2)
}

On my 32-bit Ubuntu VM (which I use precisely to test this kind of issue when
working on bigints), however, it wraps around:

ajf@andrea-VirtualBox:~$ php -r '$x = [0xFFFFFFFF => 1]; $x[] = 2; var_dump($x);'
array(2) {
  [-1]=>
  int(1)
  [0]=>
  int(2)
}

I think we should probably use an unsigned long internally, but prevent
negative values.

> Forbidding negative indices is a bit harsh and imho quite unnecessary;

Actually, I missed the bit of your email suggesting treating them as strings
the first time I read it. I’d be fine with that.

> turning “out of range” indices into strings should work just fine afaict. Is
> there a reason why it shouldn’t?

Well… there is one issue. Basically, some array functions treat integer and
string keys completely differently.

> A compromise could be to allow string keys that would otherwise have
> converted into a negative integer, but disallow negative int/float explicitly.

It’d be a complete BC break, but we could make negative indices work like they
do in Python and grab the (length + index)th item (i.e. -1 returns item 4 in a
list of 5, -2 returns item 3, and so on). However, because our arrays are weird
semi-indexed semi-hashmap things, this probably isn’t good, as it’d prevent you
from using strings like “-1” as keys. Alas, I can dream.

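For anyone who wants to see the arithmetic behind the 32-bit wrap-around
without a 32-bit VM, here is a minimal C sketch. It is not PHP's actual hash
table code: the helper names wrap_to_int32 and next_append_key are
hypothetical, and it assumes the key is simply stored in a signed 32-bit
integer using two's complement and that an append uses "last key + 1".

```c
#include <stdint.h>

/* Hypothetical helper: reinterpret a 32-bit unsigned value as a signed
 * two's-complement value, without relying on the implementation-defined
 * unsigned-to-signed conversion. */
static int32_t wrap_to_int32(uint32_t u)
{
    if (u <= (uint32_t)INT32_MAX) {
        return (int32_t)u;              /* fits, value is unchanged */
    }
    /* u is in [2^31, 2^32 - 1]; map it onto [INT32_MIN, -1]. */
    return (int32_t)(u - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}

/* Hypothetical helper: the key an append would pick after last_key,
 * assuming "last key + 1" computed modulo 2^32 (an assumption made
 * for illustration, not a description of the engine). */
static int32_t next_append_key(int32_t last_key)
{
    /* Do the addition in unsigned arithmetic so overflow is well defined. */
    return wrap_to_int32((uint32_t)last_key + 1u);
}
```

Under those assumptions, wrap_to_int32(0xFFFFFFFFu) gives -1 and
next_append_key(-1) gives 0, which matches the [-1] and [0] keys in the
32-bit output.
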
To actually respond to your suggestion, I don’t like the idea of blocking -1
but allowing “-1”. In PHP, numeric strings, integers and floats are supposed to
be equivalent, and I’m already unhappy that large integer indexes and large
numeric string indexes work differently. Whatever we do, I’d like PHP 7’s
arrays to treat integer, float and numeric string indexes consistently.

Thinking about it a little more, if we use a long for indexes, we don’t even
need to make them strings. It would fit the principle of least astonishment IMO
if any valid PHP int is a valid index and won’t be a string.

I was going to say that negative indexes don’t work right internally, but then
I realised they could work fine for indexing into the buckets if we just cast
them to unsigned longs internally (hence getting the 2’s complement
representation on modern CPUs) for indexing and hashing, but only expose signed
longs to the outside world, including through the API.

So in summary, I think we should use signed longs for indexes (or at least
whatever type PHP’s basic int is), and anything outside of the range of one
should be treated as a string. This would make numeric strings and ints
consistent, would solve all the weird overflow issues, and is the most
intuitive approach IMO.

--
Andrea Faulds
http://ajf.me/

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php