Re: Question about list context for String.chars

Mark Reed Mon, 11 Apr 2005 12:54:37 -0700

On 2005-04-11 15:40, "gcomnz" <[EMAIL PROTECTED]> wrote:
> "日本語".chars would return <日 本 語>, which can probably be expressed
> with UTF8?

The string "日本語" is probably represented internally as UTF-8, but that
should have no effect on what .chars returns, which should, indeed, be
<日 本 語>, that is, an array whose elements are strings which each represent
one Unicode code point, irrespective of encoding.
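
As a rough illustration of that distinction, here is a sketch in Python rather
than Perl 6, chosen only because its built-in strings are also sequences of
code points. The helper name chars is made up for this example and is not the
Perl 6 method itself:

```python
def chars(s):
    """Split a string into a list of one-code-point strings (hypothetical helper)."""
    return list(s)

text = "日本語"

# The code-point view does not depend on how the string is encoded.
print(chars(text))                    # three one-character strings
print(len(text.encode("utf-8")))      # 9: each of these characters takes 3 bytes
print(len(text.encode("utf-16-le")))  # 6: each takes one 2-byte code unit
```

The byte counts differ between encodings, but the list of code points is the
same either way.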

I think that, in general, at the level of Perl code, 1 “character” should be
one code point, and any higher-level support for combining and splitting
should be outside the core, in Unicode::Whatever.
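
To make the split between code points and combining behaviour concrete, a
small Python sketch follows (again an illustration, not Perl 6). The standard
unicodedata module stands in here for the hypothetical Unicode::Whatever
layer:

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one visible glyph, two code points.
decomposed = "e\u0301"

# Composition is done by a separate library call, not by the string type itself.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed))  # 2 code points at the core string level
print(len(composed))    # 1 code point after NFC composition
```

At the core level the string simply has two elements; treating them as one
user-visible character is left to the library.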

