Hi folks,

I am currently developing the next major version of TYPO3, a PHP-based content management system. As we take advantage of the new Unicode features (among other things), it will be based entirely on PHP6 with unicode.semantics turned on. (Side note: my vote would be for unicode.semantics to be turned on by default, but I trust that you will make the right decision, and we'll adapt to it.) We bypassed all backward compatibility issues by just starting from scratch again ...

Anyway, using the latest CVS version, I stumbled upon behaviour I didn't expect, and I'd like to ask whether it is intended or a bug.

Consider the following code (unicode semantics = on):

   preg_match('/(?P<character>\w),/', 'a,b,c,d', $matches);
   echo (isset($matches['character']) ? 'yes ' : 'no ');

The expected output is "yes", but it prints "no". The reason is that the index "character" in the array filled in by preg_match seems to be a binary string, not unicode as I expected:

   preg_match('/(?P<character>\w),/', 'a,b,c,d', $matches);
   echo (isset($matches[(binary)'character']) ? 'yes ' : 'no ');
   echo (isset($matches[(unicode)'character']) ? 'yes ' : 'no ');

This code outputs "yes no".
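
For comparison, here is a minimal sketch of the behaviour I would expect, assuming a PHP 5 build (no unicode semantics), where the named group is reachable through an ordinary string key. The variable name $expectedKey is purely illustrative:

```php
<?php
// Assumption: PHP 5.x without unicode semantics, where all strings are of one type.
// $expectedKey is an illustrative name, not part of the original test case.
$expectedKey = 'character';

// A named group (?P<character>...) is stored under its name AND its numeric index.
preg_match('/(?P<character>\w),/', 'a,b,c,d', $matches);

echo (isset($matches[$expectedKey]) ? 'yes ' : 'no ');  // lookup by group name
echo (isset($matches[1]) ? 'yes ' : 'no ');             // lookup by numeric index
```

Under that assumption this prints "yes yes", which is what I had expected from the unicode string key as well.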

It gets a bit worse:

   preg_match('/(?P<character>\w),/', 'a,b,c,d', $matches);
   echo (isset($matches[(unicode)'character']) ? 'yes ' : 'no ');
   extract($matches);
   echo(gettype($character) . ' ');
   $matches = compact($character);
   echo (isset($matches[(unicode)'character']) ? 'yes ' : 'no ');

The output is "no unicode no".

Sorry if I have misunderstood something here, but to me this looks like an inconsistency.

Cheers,
Robert

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
