> Your point about writing portable Unicode-friendly code is well taken.
> Rasmus and I have chatted a bit here, and we think we can propose some
> changes that may make it easier.

Sorry, I can hardly find the thread. Can you give me some hint about the subject so I can search for it?

>
> With unicode_semantics=off:
> * (unicode) cast converts binary strings to Unicode strings using
> runtime_encoding setting
> * (string) converts Unicode strings to binary strings using
> runtime_encoding again
> * Binary and Unicode strings cannot be concatenated. You have to cast
> all operands to the same type.
>
> With unicode_semantics=on:
> * (unicode) cast converts binary strings to Unicode strings. The issue
> here is whether to use script_encoding (in case you do (unicode)b"blah")

I don't think it is good to write such code, nor to speed it up by converting it at compile time.

> runtime_encoding (in case it's a binary string
> that came from elsewhere)
> * (string) converts Unicode strings to binary strings using
> runtime_encoding setting
> * Binary and Unicode strings cannot be concatenated. You have to cast
> all operands to the same type.
>

Looks good, but not allowing $binary . $unicode causes problems with old code.

In index.php:

    require_once($_SERVER["MY_PROJECT_DIR"] . "/lib.php");

where $_SERVER["MY_PROJECT_DIR"] is imported from the httpd environment (for example, set with Apache's SetEnv directive).

One has to modify it to:

    require_once($_SERVER["MY_PROJECT_DIR"] . b"/lib.php");

and such code cannot even be parsed by PHP versions before 6.

Or one has to use "declare" for even a single string:

    declare (encoding="binary") {
        require_once($_SERVER["MY_PROJECT_DIR"] . "/lib.php");
    }

> > I would *love* a pragma setting like the declare(encoding="UTF-8") to
> > say "I'm
> > going to use Unicode string literals in this file, whatever
> > unicode_semantics
> > may be." Would there be any interest in supporting a mode like this?

It should be possible to declare binary too...
unicode_semantics becomes much less useful (or harmless) if most script files have a declare at the top of the code :)

I wonder what will happen with the following code if there is no implicit (automatic) cast:

    function test($a, $b = "me")
    {
        return "$a is a friend of $b";
    }
    $x = u"x";
    $y = b"y";
    test($x);
    test($y);

I see no reason not to allow $binary . $unicode, except for performance (maybe a reason was given in the discussion thread). It would be better to use an E_STRICT notice, a profiler, etc. to tell you that an implicit cast is occurring, and to do so for performance reasons only.