Hi folks,
I am currently developing the next major version of TYPO3, a PHP-based
content management system. As we take advantage of the new Unicode
features (among other things), it will be completely based on and rely
on PHP6 with unicode.semantics turned on. (Side note: my vote would be
for unicode.semantics to be turned on by default, but I trust that you
will make the right decision, and we'll adapt to it.) We bypassed all
backward compatibility issues by just starting from scratch again ...
Anyway, using the latest CVS version, I stumbled over a behaviour I
didn't expect, and I'd like to ask whether it is intended behaviour
or a bug.
Consider the following code (unicode semantics = on):
preg_match('/(?P<character>\w),/', 'a,b,c,d', $matches);
echo (isset($matches['character']) ? 'yes ' : 'no ');
The expected output is "yes", but it prints "no". The reason is that
the index "character" in the array returned by preg_match seems to be
a binary string, not unicode as I expected:
preg_match('/(?P<character>\w),/', 'a,b,c,d', $matches);
echo (isset($matches[(binary)'character']) ? 'yes ' : 'no ');
echo (isset($matches[(unicode)'character']) ? 'yes ' : 'no ');
This code outputs "yes no".
Even worse:
preg_match('/(?P<character>\w),/', 'a,b,c,d', $matches);
echo (isset($matches[(unicode)'character']) ? 'yes ' : 'no ');
extract($matches);
echo(gettype($character) . ' ');
$matches = compact($character);
echo (isset($matches[(unicode)'character']) ? 'yes ' : 'no ');
The output is "no unicode no".
Sorry if I have misunderstood something here, but to me this looks
like a small inconsistency.
Cheers,
Robert
--
PHP Internals - PHP Runtime Development Mailing List