hi,

Thanks for the patches :)

Can you open a bug report please (and attach the patches to it)? I'm
sure this patch will be updated a couple of times before it reaches
the repository.

Cheers,

On Fri, Dec 31, 2010 at 4:58 PM, Marcin Babij
<marcin.ba...@nasza-klasa.pl> wrote:
> Sorry for no attachments in previous message, I think my attachments weren't
> redirected with message by lists.php.net email confirmation system. I send
> them again, and for sure I attach links to public copy of them over HTTP:
> https://gist.github.com/761094 - php-5.3.4-hashtable-optimization.patch
> https://gist.github.com/761096 - apc-3.1.6-hashtable-optimization.patch
>
>> Hello,
>> I work for social network company, where we were running optimization
>> project. One of it's results is patch to Zend engine's Hashtable, which we
>> want to share and ask you for comments and improvements.
>>
>> Why we do this?
>> We run profiling on our production servers and found out that zend_hash_*
>> functions take 10-20% CPU time of request. So there is some room for easy
>> improvements.
>>
>> What was done?
>> - Hash function in zend_hash.h was rebuilt and became much faster, without
>> losing the most important properties.
>> - Hashtable implementation was changed from Simple chaining to Open
>> addressing with linear probing, but with linked bucket, not included in
>> hash array, which causes:
>> -- Bucket structure to lose 2 pointers.
>> -- Searching works similar, but don't have to jump with pointers stored in
>> different memory locations, inserting, deleting and rehashing don't need
>> to update linked list, but must search for first empty bucket, which is
>> fast, because it scans continuous memory.
>> -- Load factor decreases from 1.0 to 0.5-0.75 to make less collisions and
>> faster hashtable, which in turn increases memory footprint a little.
>> - Open addressing doesn't change significantly performance, but next thing
>> was to create new array (arEmpty), which is of size nTableSize bytes,
>> which keeps track of used/empty buckets and makes inserting and rehashing
>> much faster. In future it can be tested as bit-array with size of
>> nTableSize/8 bytes.
>> - More macros were added to replace repetitive constructs.
>> - New constants were added to allow:
>> -- Creating new hashtables of size at least X (where 4 and 8 are
>> reasonable), which makes no rehashing and reallocing memory while changing
>> size to 2 and then to 4.
>> -- For small tables it's better to extend them by a factor of 4 times, not
>> 2, to make rehashing cost smaller for most hashtables, of cost of little
>> higher memory consumption.
>> -- For large tables it's better to have other load factor, closer to 1,
>> while for small tables it's better to use load factor closer to 0.5.
>> - APC was patched to take changes in Bucket structure into account.
>>
>> How was it tested?
>> It was tested with make test, where one more (comparing to original
>> sources) test fails, but it's most probably because
>> http://bugs.php.net/bug.php?id=48858 - IMO test is badly constructed (is
>> too simple) and any change of hashing function makes it fail. Also it was
>> tested on our testing environment and production servers against >30mln
>> requests to our site, with 120requests/s at peak on Xeon @ 2.50GHz with
>> 8GB RAM running Debian Linux.
>>
>> What is the gain?
>> After tests CPU usage dropped by about 4% -6%.
>> Memory footprint goes up just by few percent.
>>
>> What can be done in future?
>> - Make new constants configurable by php.ini.
>> - Test if changing arEmpty from byte-array to bit-array helps on
>> performance.
>> - Tweak default constants' values using some real-live benchmarks.
>> - Prove (or modify and prove) hash function to have property, that it has
>> no collisions if two keys don't differ on no more than 6 bytes, which will
>> lead to memcmp omit first (or last) 6 bytes of key. Also simpler thing may
>> be proven, that is it has no collisions if two keys are not longer than 6
>> bytes, which will make most string keys omit memcpy at all.
>>
>> The patch was created and tested against php-5.3.0, apc-3.1.3p1, then
>> merged with php-5.3.4, apc-3.1.6 without conflicts, and for these last
>> versions patches are attached. Also, it shouldn't conflict with
>> http://wiki.php.net/rfc/performanceimprovements .
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>



-- 
Pierre

@pierrejoye | http://blog.thepimp.net | http://www.libgd.org

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to