Hello Stanislav,

cool, care to change the code snippet into a test as I've done for Rui's snippet?
marcus

Sunday, March 23, 2008, 5:06:53 AM, you wrote:

>> is broken code and not a single test. If this is not going to change as in
>> we are not getting any .phpt files for this feature then there are two
>
> As I understand the theory of the thing should be pretty simple, you set
> input encoding (by config or declare) and internal encoding, and then
> when script is being read, you convert it from input to internal.
> However, it appears that since flex couldn't stomach certain encodings,
> there's also a hack there - script is translated from input to some
> "safe" encoding for flex, and then strings are translated back to
> "internal" encoding after flex processes them. If re2c can deal with
> encodings like SJIS without trouble then some of the hacks might be
> unnecessary. I think encodings that need to be checked are those in
> zend_multibyte.c that have "compatible" flag off.
>
> Here's a short script example I found that shows what's the problem there:
>
> <?php echo 'ソ'; ?>
>
> Character echoed there is U+30BD "Katakana letter SO". Now if you run it
> in UTF-8, works good. However, if you recode it to Shift-JIS, it won't
> run, since this script looks to the parser this way:
>
> <?php echo '<83>\'; ?>
>
> (that's dump of VI output, so replace <83> with actual 0x83 if you
> compose it). That's parse error for the parser, if parsed "naively". So
> somehow the parser needs to know 0x83+\ is actually U+30BD and at the
> same time the user still might want it as 0x83+\ in a zval (or maybe as
> utf-8 - it depends on him).
> --
> Stanislav Malyshev, Zend Software Architect
> [EMAIL PROTECTED]   http://www.zend.com/
> (408)253-8829   MSN: [EMAIL PROTECTED]

Best regards,
 Marcus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
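
To make the quoted Shift-JIS problem concrete, here is a minimal Python sketch of the byte-level issue. It is not PHP's actual scanner: it assumes a simplified single-quoted-string rule in which a backslash (0x5C) escapes the next byte, and the names `is_sjis_lead` and `find_closing_quote` are hypothetical, chosen only for illustration. The lead-byte ranges are the ones commonly given for Shift-JIS/CP932.

```python
def is_sjis_lead(byte: int) -> bool:
    """Return True if `byte` can start a two-byte Shift-JIS character.

    Assumption: the commonly cited lead-byte ranges 0x81-0x9F and 0xE0-0xFC.
    """
    return 0x81 <= byte <= 0x9F or 0xE0 <= byte <= 0xFC


def find_closing_quote(body: bytes, sjis_aware: bool) -> int:
    """Find the closing single quote in `body` (the bytes after the opening quote).

    Returns the index of the closing quote, or -1 if the string never terminates.
    With sjis_aware=False every 0x5C byte is treated as an escape ("naive" parsing).
    """
    i = 0
    while i < len(body):
        b = body[i]
        if sjis_aware and is_sjis_lead(b) and i + 1 < len(body):
            i += 2  # skip lead byte and trail byte as one character
        elif b == 0x5C and i + 1 < len(body):
            i += 2  # backslash escape: skip the escaped byte
        elif b == 0x27:
            return i  # unescaped single quote ends the string
        else:
            i += 1
    return -1


# U+30BD "Katakana letter SO" is the two bytes 0x83 0x5C in Shift-JIS.
so = "ソ".encode("shift_jis")
body = so + b"'; ?>"  # what follows the opening quote in the quoted script

naive_result = find_closing_quote(body, sjis_aware=False)
aware_result = find_closing_quote(body, sjis_aware=True)
```

In this sketch the naive scan consumes the trail byte 0x5C as an escape for the closing quote and never finds the end of the string, which corresponds to the parse error described in the quoted mail. The encoding-aware scan skips 0x83 0x5C as a single character and finds the quote right after it.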