On 01/14/2013 01:55 PM, Gustavo Lopes wrote:
On Wed, 09 Jan 2013 23:45:03 +0100, Gustavo Lopes <glo...@nebm.ist.utl.pt>
wrote:
On Thu, 03 Jan 2013 11:40:31 +0100, Gustavo Lopes <glo...@nebm.ist.utl.pt>
wrote:
The algorithm behaves very poorly in this case because at each position of the
text, all the substrings starting there and with size between m and n (where m
is the size of the smallest pattern and n is the largest) are checked, even if
there are only
two patterns with size m and n. We could fix this easily by building a set of
the pattern sizes found and try only with those. The hashing of the substrings
could also be improved; we don't have to recalculate everything when we advance
in the text.
Both optimizations (the hash rolling and limiting the substrings hashed on each
iteration) worked quite well.
But I got much better results with another algorithm [1], so I'm going to merge
the branch with it [2] instead.
OK, so now the plan is to merge this onto 5.4:
https://github.com/cataphract/php-src/compare/php:PHP-5.4...cataphract:strtr_wu94_54
And this to 5.5:
https://github.com/cataphract/php-src/compare/php:PHP-5.5...cataphract:strtr_wu94_55
The difference is that the 5.4 version does not introduce zend_qsort_r() and
instead has dedicated simple heap sort.
Any thoughts on this?
Despite temptation, I'm still erring on the side of merging to 5.5+
only. My argument is based on the potential for regressions (behavior
and performance), added code maintenance issues, merge effort etc.
The ability to differentiate the benefits of PHP 5.5 over 5.4, and
promote migration to 5.5 will also be improved.
Chris
--
christopher.jo...@oracle.com http://twitter.com/ghrd
Newly updated, free PHP & Oracle book:
http://www.oracle.com/technetwork/topics/php/underground-php-oracle-manual-098250.html
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php