Re: [PHP-DEV] PHP 6

Lester Caine Sat, 13 Mar 2010 03:10:07 -0800

Chen Ze wrote:

On Sat, Mar 13, 2010 at 2:34 AM, Derick Rethans<[email protected]>  wrote:

On Fri, 12 Mar 2010, Hannes Magnusson wrote:

On Fri, Mar 12, 2010 at 17:38, Moriyoshi Koizumi<[email protected]>  wrote:

I'd love to see my brand-new mbstring implementation in the release.
Dropping mbstring completely won't be any good because lots of
applications rely on it, but I don't really want to maintain the funky
library bundled with it.


That's actually one of the ideas we had on IRC.
That mbstring patch and more ext/intl features should be enough to
solve "the unicode problem".


Sorry, but that is not true. intl and mbstring can provide functions for
manipulating UTF-8 strings, but they cannot provide proper Unicode
support. Proper Unicode support is *not* just dealing with UTF-8
strings. It includes dealing with file streams, with different
encodings, with localization, with sorting, with locales, with
formatting numbers. Offloading this to extensions makes Unicode support
an add-on hack, and not a language feature. I am not saying that intl
and mbstring aren't *useful*, but they definitely do not solve "the
unicode problem".


I think Unicode should only be concerned with string handling. Number
formatting should not be Unicode's concern. Unicode is a standard for
text, not for text or number formatting.

Number formatting already existed before we had Unicode. It even
existed before the computer was invented.

The same goes for sorting.

When we think about Unicode, we should think about the things that are
really related to Unicode, like the file system. Number formatting and
sorting are other matters, which intl takes care of.

For Unicode, I think we should implement something like:

$chars=new mchar($bytes,$bytes_encoding);
echo $chars;//printed in the output encoding
foreach ($chars as $char) {
       //output a single utf-16/utf-8 char (depends on the default output encoding)
       echo $char;
}
echo $chars->bytes('gbk');

$chars->outputEncoding('gbk');
echo $chars;

ini_set('mchar_output_encoding','gbk');
echo $chars;

ini_set('mchar_filesystem_encoding','gbk');
echo $chars->filepath();
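
As a rough illustration only, part of the behaviour proposed above can be approximated in userland with functions that mbstring and PCRE already provide. The class name MChar and its methods are hypothetical and simply mirror the proposal; they are not an existing PHP API. The sketch assumes a current PHP version with the mbstring extension loaded, iterates over code points rather than grapheme clusters, and leaves out the ini settings and filepath() entirely.

```php
<?php
// Hypothetical sketch: MChar is NOT an existing PHP class. It only wraps
// real functions (mb_convert_encoding, preg_split).
final class MChar implements IteratorAggregate
{
    private $utf8;                       // text is held internally as UTF-8
    private $outputEncoding = 'UTF-8';   // assumed default output encoding

    public function __construct(string $bytes, string $bytesEncoding)
    {
        // Normalise the incoming bytes to UTF-8 once, at the boundary.
        $this->utf8 = mb_convert_encoding($bytes, 'UTF-8', $bytesEncoding);
    }

    public function outputEncoding(string $encoding): void
    {
        $this->outputEncoding = $encoding;
    }

    public function bytes(string $encoding): string
    {
        // Convert the internal UTF-8 text to the requested encoding.
        return mb_convert_encoding($this->utf8, $encoding, 'UTF-8');
    }

    public function getIterator(): Traversable
    {
        // The /u flag makes PCRE treat the subject as UTF-8, so the empty
        // pattern splits between code points instead of between bytes.
        $chars = preg_split('//u', $this->utf8, -1, PREG_SPLIT_NO_EMPTY);
        return new ArrayIterator($chars ?: []);
    }

    public function __toString(): string
    {
        return $this->bytes($this->outputEncoding);
    }
}

$chars = new MChar("caf\xE9", 'ISO-8859-1');    // "café" as Latin-1 bytes
foreach ($chars as $char) {
    echo bin2hex($char), "\n";                   // bytes of each UTF-8 character
}
echo bin2hex($chars->bytes('UTF-16BE')), "\n";   // same text re-encoded
```

A wrapper like this only covers string content. It does nothing for streams, sorting, or identifiers, which is where the opinions in this thread diverge.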


I think this probably highlights the fundamental difference of opinions on 
Unicode?

Handling unicode CONTENT is not the problem here. People nowadays expect to be able to use their own language to write code, and create functions using words that they recognize. In databases, table and field names are now expected to support unicode, rather than just handling unicode data pumped into ascii titled fields.

Personally I'm quite happy with just using ascii names for things, but more and more overseas customers provide contact details in 'strange' character sets that only unicode can handle, and handling THAT in PHP5 is not a problem. The problem comes when people start building databases with unicode metadata and expect the tools interfacing with that to understand unicode as well.

It was my understanding that PHP6 was intended to provide international users with something that they could use in their own native language? Unicode titled files with unicode titled classes and functions.


--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
