On Fri, 09 Feb 2007 22:27:37 +0100, in php.internals [EMAIL PROTECTED] (Lukas Kahwe Smith) wrote:
>Well I remember that Andrei once brought up the idea of an ereg wrapper
>around pcre, but IIRC the idea was dropped because there would be tons
>of subtle issues from edge cases. So the decision was made to simply
>turn it into a pecl extension to be installed by the user as needed
>without shiny new unicode support. Not sure if that means that ereg
>would not be made to work with PHP6 at all ...

Maybe it is time to re-check the preg arguments.

Currently the preg pattern input is a quagmire of partial Perl style and ordinary function arguments. In Perl, delimiters make sense (e.g. s/foo/bar/i , s_foo_bar_i , s(foo)(bar)i , s<foo><bar>i ). The delimiters are an easy way to separate arguments. But in PHP we are already separating them as ordinary function arguments, though the separation is a bit arbitrary: s/foo/bar/i would be changed to preg_replace('/foo/i','bar',$input) . So we separate the replace part, but not the match or the flags.

We partially pretend that we are working Perl-style, but obviously we aren't. For instance, "/foo\Q$bar\E/" would not work exactly as /foo\Q$bar\E/ would in Perl if $bar contained an \E (preg_match() would just receive the interpolated string):

$ perl -le '$foo = "\\E"; $_ = " \\E "; print m/\Q $foo \E/;'
1
$ php -r '$foo = "\\E"; print preg_match("/\Q $foo \E/"," \\E ");'
0

Furthermore, the /e flag produces results that look strange at first sight:

<?php
$string = <<<EOD
A "little" test and a 'little' test
EOD;
print preg_replace('_(.*)_e',"'$1'",$string)."\n";
print preg_replace('_(.*)_e','"$1"',$string)."\n";
?>

I think it confuses more than it helps that we require the delimiters and the current arguments (the preg_* functions require match-and-flags as the first argument, which itself consists of two arguments: the match and the flags).

As mentioned, there is no way \Q and \E could work exactly as in Perl. The same goes for the e flag. There are already alternatives, e.g. preg_quote() and preg_replace_callback(), to overcome these shortcomings. But I don't think people would be encouraged to use them without first crashing into the shortcomings of the ordinary usage with \Q, \E and /e.

Unfortunately there isn't an easy fix. PHP isn't Perl; for the above to work would require major changes in the PHP language parser. The nice way would be to get rid of delimiters and separate the flags into their own argument, but it would be a hell of a BC break if the input form suddenly changed.

Nonetheless, the current PCRE functions lead to confusion and weirdness as long as the Perl syntax is mixed with PHP.

-- 
- Peter Brodersen
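
For readers who want to see the two alternatives named above, here is a minimal sketch. It is an illustration only, not code from the original post. The variable names ($needle, $subject, $text) and the strtoupper() transformation are assumptions chosen for the example, and the closure syntax assumes PHP 5.3 or later.

```php
<?php
// Hypothetical input: a literal string that happens to contain \E.
$needle  = '\\E';     // the two characters: backslash, E
$subject = ' \\E ';   // space, backslash, E, space

// Interpolating into \Q...\E: the \E inside $needle ends the quoting early,
// so the pattern becomes /\Q \E \E/ and no longer matches the subject.
$naive = preg_match("/\\Q $needle \\E/", $subject);

// preg_quote() escapes each special character (and the delimiter passed as
// the second argument), so nothing in $needle can terminate the quoting.
$quoted = preg_match('/ ' . preg_quote($needle, '/') . ' /', $subject);

// preg_replace_callback() hands the match to a function as an array, so the
// replacement is computed by ordinary PHP code instead of an evaluated string.
$text  = "A \"little\" test and a 'little' test";
$upper = preg_replace_callback(
    '/little/',
    function ($m) {
        return strtoupper($m[0]); // $m[0] is the full matched text
    },
    $text
);

var_dump($naive, $quoted, $upper);
```

With the callback form, quote characters in the subject never pass through string evaluation, which is the source of the surprising /e output discussed in the message.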