>> We are working on different code. You have code with some specific
>> character set and you can control all strings.
>
> Tomas, stop arguing on this. As a library maintainer, I agree with you and
> I don't understand where the 'killer feature' is (I heard that Yahoo China
> asked for it, or is it because Zend is established in Israel, I don't
> know...), but, now, if people don't switch to PHP 6 (and I am sure they
> won't), it will be your fault, because of your supposed FUD ;)

----
/**
 * @param string $string utf8 string
 * @return string html encoded string
 */
function test_convert_utf8ToHtml($string) {
    // removed 0xE0-0xFD decoding

    // decode two byte utf8 characters
    $string = preg_replace("/([\300-\337])([\200-\277])/e",
        "'&#'.((ord('\\1')-192)*64+(ord('\\2')-128)).';'",
        $string);

    // remove broken utf8
    $string = preg_replace("/[\200-\237]|\240|[\241-\377]/",'?',$string);

    return $string;
}
// \u0105\u30A1
$string = 'ąァ';
// expected result '&#261;???' or '&#261;&#12449;'
echo test_convert_utf8ToHtml($string);
----
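
To make the arithmetic in the two-byte branch concrete, here is a minimal sketch that decodes one two-byte sequence by hand. The helper name sketch_decode_two_byte() is hypothetical and not part of any library; it assumes the input is a binary (byte) string, as in PHP 5.

```php
<?php
/**
 * Hypothetical helper: decode one two-byte UTF-8 sequence to its code point.
 * Assumes $lead is a byte in 0xC0-0xDF and $trail is a byte in 0x80-0xBF.
 */
function sketch_decode_two_byte($lead, $trail) {
    // Same formula as in the regex replacement in test_convert_utf8ToHtml()
    return (ord($lead) - 192) * 64 + (ord($trail) - 128);
}

// "ą" (U+0105) is the byte pair 0xC4 0x85 in UTF-8:
// (0xC4 - 192) * 64 + (0x85 - 128) = 4 * 64 + 5 = 261
$codepoint = sketch_decode_two_byte("\xC4", "\x85");
echo '&#' . $codepoint . ';';
```

The formula only makes sense if ord() sees individual bytes, which is exactly what is in question under unicode.semantics=on.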

Please show how to do this in PHP 6 with unicode.semantics=on, without
mbstring, recode or other character set conversion extensions, and without
the htmlentities() function: only core functions and the pcre extension.
Then make the updated function compatible with PHP 5.2.0.

test_convert_utf8ToHtml() is based on code from a modular library. I know
that I can split it into PHP 5 and PHP 6 code, but I can find functions that
are not modularized and can't be replaced with unicode_encode(). For example,
MIME Q encoding or 8-bit string detection.
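
As a rough illustration of the kind of byte-oriented code meant here, the sketch below shows 8-bit detection and a simplified Q encoding in the style of RFC 2047. All three function names are hypothetical, the charset is assumed to be UTF-8, and line folding and length limits are left out. Both helpers assume the string is a sequence of bytes, as in PHP 5.

```php
<?php
// Hypothetical helper: true if the byte string contains any byte >= 0x80.
function sketch_is_8bit($bytes) {
    return preg_match('/[\x80-\xFF]/', $bytes) === 1;
}

// Hypothetical callback: turn one matched byte into its "=XX" form.
function sketch_q_encode_byte($match) {
    return sprintf('=%02X', ord($match[0]));
}

// Hypothetical helper: simplified Q encoding of a UTF-8 byte string.
function sketch_mime_q_encode($bytes) {
    // Encode every byte except ASCII letters, digits and the space
    $encoded = preg_replace_callback('/[^A-Za-z0-9 ]/', 'sketch_q_encode_byte', $bytes);
    // In Q encoding a space is written as an underscore
    $encoded = str_replace(' ', '_', $encoded);
    return '=?UTF-8?Q?' . $encoded . '?=';
}

var_dump(sketch_is_8bit('abc'));         // bool(false)
var_dump(sketch_is_8bit("\xC4\x85"));    // bool(true)
echo sketch_mime_q_encode("a \xC4\x85"); // =?UTF-8?Q?a_=C4=85?=
```

Named callbacks are used instead of closures so that the sketch stays within PHP 5.2 syntax.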

-- 
Tomas

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
