Indeed it is not meant to work with UNICODE, but as an optimization for PHP5. I tried a lot of examples in ISO-8859-2, and ISO-8859-10 as these are the most interesting for me - and it worked fine. If it really causes problems while original implementation works than details are appreciated.
Michal On Wed, Jun 11, 2008 at 9:01 AM, Texin, Tex <[EMAIL PROTECTED]> wrote: > When I looked at the code, I assumed that it wasn't intended for > international use > I'll have to go back and look to give you details, but it doesn't work for > international use or unicode. > It would be ok for 8859-1. > > > -----Original Message----- > > From: Scott MacVicar [mailto:[EMAIL PROTECTED] > > Sent: Monday, June 09, 2008 6:58 AM > > To: Nuno Lopes > > Cc: internals@lists.php.net; Michal Dziemianko > > Subject: Re: [PHP-DEV] Algorithm Optimizations - string search > > > > There is rabin-karp too but its worse case is O(nm) so that > > might not be ideal, perhaps we should try to compare all of them. > > > > Scott > > > > Nuno Lopes wrote: > > > Hi, > > > > > > So some comments: > > > - you have some problems with the indentation. We only use tabs, so > > > please stick to that. Also, there are some lines that are > > not indented > > > correctly > > > - Have you considered the Boyer-Moore algorithm? I think > > it's a little > > > faster than KMP (take a look at e.g. > > > http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/) > > > - please remove the //TUTAJ SKONCZYLEM comment > > > - revert this change (as well as a few other that are similar): > > > - for (r_end = r + Z_STRLEN_P(return_value) - 1; r < > > r_end; ) { > > > + for ( r_end = r + Z_STRLEN_P( return_value ) - 1; r > > < r_end; > > > ){ (we like small diffs, not long diffs with changes that > > also break > > > our coding standards. e.g. we don't use space after the '(' char. > > > Philip wrote a nice article about diffs at > > > http://wiki.php.net/doc/articles/whitespace) > > > - in strrpos_reverse_kmp() I think you allocate 4 bytes > > less that you > > > want > > > - I think you've too many comments.. We don't need 1 > > comment per line > > > :) > > > > > > After fixing all these points and after running the test suite (with > > > valgrind) and make sure there are no regressions, I think it's safe > > > for you to commit. Still, I would like to see some > > performance figures > > > comparing the KMP and the Boyer-Moore (or point me some > > papers about > > > the subject). > > > > > > Thanks for your work and good luck for the rest of the SoC > > project :) > > > > > > Nuno > > > > > > > > > ----- Original Message ----- From: "Michal Dziemianko" > > > <[EMAIL PROTECTED]> > > > To: <internals@lists.php.net> > > > Sent: Monday, June 09, 2008 12:39 PM > > > Subject: [PHP-DEV] Algorithm Optimizations - string search > > > > > > > > >> Hello, > > >> Here: http://212.85.117.53/DIFF.txt is small patch that > > will speed > > >> up following functions: > > >> strpos, > > >> stripos, > > >> strrpos > > >> strripos, > > >> and probably some others (all that use zend_memnstr/php_memnstr > > >> function) > > >> > > >> The speedup of zend_memnstr is about 8% on average (about > > 30% in case > > >> of artificial strings). > > >> Functions strrpos and strripos are about 25% faster on average. > > >> > > >> The only drawback - it needs some additional space (size > > of the needle). > > >> > > >> All functions pass all the tests. > > >> > > >> If it looks fine than I will apply for cvs account. > > >> > > >> Cheers, > > >> Michal > > > > > > > > > > -- > > PHP Internals - PHP Runtime Development Mailing List To > > unsubscribe, visit: http://www.php.net/unsub.php > > > > > -- -------------------------------------------------------------- I love MacOS X.................................................... --------------------------------------------------------------