On 01/14/2013 01:55 PM, Gustavo Lopes wrote:
On Wed, 09 Jan 2013 23:45:03 +0100, Gustavo Lopes <glo...@nebm.ist.utl.pt> 
wrote:

On Thu, 03 Jan 2013 11:40:31 +0100, Gustavo Lopes <glo...@nebm.ist.utl.pt> 
wrote:

The algorithm behaves very poorly in this case because at each position of the 
text, all the substrings starting there and with size between m and n (where m 
is the size of the smallest pattern and n is the largest) are checked, even if 
there are only
two patterns with size m and n. We could fix this easily by building a set of 
the pattern sizes found and try only with those. The hashing of the substrings 
could also be improved; we don't have to recalculate everything when we advance 
in the text.


Both optimizations (the hash rolling and limiting the substrings hashed on each 
iteration) worked quite well.

But I got much better results with another algorithm [1], so I'm going to merge 
the branch with it [2] instead.

OK, so now the plan is to merge this onto 5.4:

https://github.com/cataphract/php-src/compare/php:PHP-5.4...cataphract:strtr_wu94_54

And this to 5.5:

https://github.com/cataphract/php-src/compare/php:PHP-5.5...cataphract:strtr_wu94_55

The difference is that the 5.4 version does not introduce zend_qsort_r() and 
instead has dedicated simple heap sort.

Any thoughts on this?


Despite temptation, I'm still erring on the side of merging to 5.5+
only.  My argument is based on the potential for regressions (behavior
and performance), added code maintenance issues, merge effort etc.
The ability to differentiate the benefits of PHP 5.5 over 5.4, and
promote migration to 5.5 will also be improved.

Chris

--
christopher.jo...@oracle.com  http://twitter.com/ghrd
Newly updated, free PHP & Oracle book:
http://www.oracle.com/technetwork/topics/php/underground-php-oracle-manual-098250.html

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to