hi,

Did you forget to attach the patch? Attach it as .txt so the list
won't reject it.

Cheers,

On Fri, Dec 31, 2010 at 3:40 PM, Marcin Babij
<marcin.ba...@nasza-klasa.pl> wrote:
> Hello,
> I work for social network company, where we were running optimization
> project. One of it's results is patch to Zend engine's Hashtable, which we
> want to share and ask you for comments and improvements.
>
> Why we do this?
> We run profiling on our production servers and found out that zend_hash_*
> functions take 10-20% CPU time of request. So there is some room for easy
> improvements.
>
> What was done?
> - Hash function in zend_hash.h was rebuilt and became much faster, without
> losing the most important properties.
> - Hashtable implementation was changed from Simple chaining to Open
> addressing with linear probing, but with linked bucket, not included in hash
> array, which causes:
> -- Bucket structure to lose 2 pointers.
> -- Searching works similar, but don't have to jump with pointers stored in
> different memory locations, inserting, deleting and rehashing don't need to
> update linked list, but must search for first empty bucket, which is fast,
> because it scans continuous memory.
> -- Load factor decreases from 1.0 to 0.5-0.75 to make less collisions and
> faster hashtable, which in turn increases memory footprint a little.
> - Open addressing doesn't change significantly performance, but next thing
> was to create new array (arEmpty), which is of size nTableSize bytes, which
> keeps track of used/empty buckets and makes inserting and rehashing much
> faster. In future it can be tested as bit-array with size of nTableSize/8
> bytes.
> - More macros were added to replace repetitive constructs.
> - New constants were added to allow:
> -- Creating new hashtables of size at least X (where 4 and 8 are
> reasonable), which makes no rehashing and reallocing memory while changing
> size to 2 and then to 4.
> -- For small tables it's better to extend them by a factor of 4 times, not
> 2, to make rehashing cost smaller for most hashtables, of cost of little
> higher memory consumption.
> -- For large tables it's better to have other load factor, closer to 1,
> while for small tables it's better to use load factor closer to 0.5.
> - APC was patched to take changes in Bucket structure into account.
>
> How was it tested?
> It was tested with make test, where one more (comparing to original sources)
> test fails, but it's most probably because
> http://bugs.php.net/bug.php?id=48858 - IMO test is badly constructed (is too
> simple) and any change of hashing function makes it fail. Also it was tested
> on our testing environment and production servers against >30mln requests to
> our site, with 120requests/s at peak on Xeon @ 2.50GHz with 8GB RAM running
> Debian Linux.
>
> What is the gain?
> After tests CPU usage dropped by about 4% -6%.
> Memory footprint goes up just by few percent.
>
> What can be done in future?
> - Make new constants configurable by php.ini.
> - Test if changing arEmpty from byte-array to bit-array helps on
> performance.
> - Tweak default constants' values using some real-live benchmarks.
> - Prove (or modify and prove) hash function to have property, that it has no
> collisions if two keys don't differ on no more than 6 bytes, which will lead
> to memcmp omit first (or last) 6 bytes of key. Also simpler thing may be
> proven, that is it has no collisions if two keys are not longer than 6
> bytes, which will make most string keys omit memcpy at all.
>
> The patch was created and tested against php-5.3.0, apc-3.1.3p1, then merged
> with php-5.3.4, apc-3.1.6 without conflicts, and for these last versions
> patches are attached. Also, it shouldn't conflict with
> http://wiki.php.net/rfc/performanceimprovements .
>
> --
> Marcin Babij
> nasza-klasa.pl
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>



-- 
Pierre

@pierrejoye | http://blog.thepimp.net | http://www.libgd.org

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to