On Sat, Jan 5, 2013 at 6:58 PM, Nikita Nefedov <inefe...@gmail.com> wrote:
> Though there would be a little speed-up because with Symbols array's
> Buckets will keep numeric key, so instead of memcmp you will need to just
> compare two longs when retrieving element.

Before memcmp'ing the array keys, PHP first compares the pointers. For
interned strings the pointers are the same, so the memcmp is not done. See
http://lxr.php.net/xref/PHP_TRUNK/Zend/zend_hash.c#950.

> Actually this looks a little bad now, AFAIK this is what happens when
> you try to retrieve a value from an array by string key:
>
> 1. Calling zend_new_interned_string_int for interning or getting
>    already interned same string, receiving pointer to the stored string
>    from it:
>    1. Hash the string (O(n))
>    2. Retrieve bucket from arBuckets
>    3. Find needed bucket by iterating over all retrieved buckets (over
>       *pLast) and comparing its keys with memcmp
>    4. If found - return pointer to string, else create new bucket...
> 2. Now that we have an interned string, we can try to retrieve value
>    from array with string:
>    1. Hash the string again (O(n))
>    2. Retrieve bucket by hash from arBuckets
>    3. And again memcmp used for comparing strings

The string interning happens at compile time, so for all practical purposes
it does not matter much how long it takes. At runtime the hash is already
precalculated and the pointers are the same (see above), so the array access
goes through pretty much the fastest possible code path.

I quickly tried out how fast string keys are compared to numeric keys:
http://codepad.viper-7.com/Km9hPz

String and integer keys perform pretty much the same there.

Nikita
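
P.S. To make the pointer-before-memcmp point concrete, here is a minimal
sketch of that kind of key comparison. It is not the actual zend_hash.c
code: bucket_t, djb_hash and key_matches are hypothetical names, the bucket
layout is simplified, and the used_memcmp out-parameter exists only so the
path taken can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified bucket; not the real Zend Bucket layout. */
typedef struct {
    unsigned long h;    /* hash of the key, computed once and stored */
    const char *key;    /* pointer to the key bytes (possibly interned) */
    size_t key_len;     /* key length in bytes */
} bucket_t;

/* Stand-in "times 33" string hash, used here only to fill in bucket_t.h. */
static unsigned long djb_hash(const char *s, size_t len)
{
    unsigned long h = 5381UL;
    for (size_t i = 0; i < len; i++) {
        h = h * 33UL + (unsigned char)s[i];
    }
    return h;
}

/*
 * Returns 1 if the bucket's key equals the lookup key, 0 otherwise.
 * *used_memcmp is set to 1 only when the byte-wise comparison was needed.
 */
static int key_matches(const bucket_t *b, const char *key, size_t key_len,
                       unsigned long h, int *used_memcmp)
{
    assert(b != NULL && key != NULL && used_memcmp != NULL);
    *used_memcmp = 0;

    /* Same pointer: both sides refer to the same (interned) string. */
    if (b->key == key) {
        return 1;
    }
    /* Different hash or length: cannot be equal, no memcmp required. */
    if (b->h != h || b->key_len != key_len) {
        return 0;
    }
    /* Slow path: compare the bytes. */
    *used_memcmp = 1;
    return memcmp(b->key, key, key_len) == 0;
}
```

Looking up a bucket with the very same pointer it was stored under returns
a match without touching memcmp; only a separate copy of the same bytes
falls through to the byte-wise comparison.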