Re: [perl #19179] [PATCH] creating string_max_bytes()

Dan Sugalski Mon, 16 Dec 2002 16:17:10 -0800

At 1:50 PM +0000 12/16/02, Nicholas Clark wrote:

On Mon, Dec 16, 2002 at 01:07:36PM +0000, mcharity @ vendian. org wrote:
This question is actually independent of the patch (which looks good).
simply returns the C<INTVAL> it is passed; C<string_utf8_max_bytes>, on the
other hand, returns three times the value that it is passed because a
UTF8 character may occupy up to three bytes.
Should that really be the number 3? I thought that the UTF8 representation of
code points outside the base Unicode plane could get longer than that.

I think it should be at least 4, potentially 6. It looks like the Unicode Consortium has given up and admitted that they're potentially going to use the entire 32-bit space at some point. (Or so my recent run-through of their online stuff seemed to indicate.)
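
For reference, here is a rough C sketch of where the numbers 3, 4, and 6 come from. It assumes the original UTF-8 scheme, which encodes values up to 31 bits in at most 6 bytes. The later RFC 3629 definition stops at U+10FFFF, where 4 bytes suffice. The names utf8_bytes_for_codepoint and utf8_max_bytes_sketch are made up for illustration and are not taken from the patch or the Parrot source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): number of bytes needed to
 * encode a single code point under the original, up-to-31-bit UTF-8
 * scheme. Returns 0 if the value cannot be encoded at all. */
static size_t utf8_bytes_for_codepoint(uint32_t cp)
{
    if (cp < 0x80)         return 1;  /* 7 bits: ASCII                 */
    if (cp < 0x800)        return 2;  /* 11 bits                       */
    if (cp < 0x10000)      return 3;  /* 16 bits: the base plane       */
    if (cp < 0x200000)     return 4;  /* 21 bits: covers U+10FFFF      */
    if (cp < 0x4000000)    return 5;  /* 26 bits                       */
    if (cp <= 0x7FFFFFFF)  return 6;  /* 31 bits                       */
    return 0;                         /* beyond what the scheme covers */
}

/* Hypothetical worst-case buffer size for nchars characters, given an
 * assumed per-character maximum (3, 4, or 6 depending on how much of
 * the code space is allowed). */
static size_t utf8_max_bytes_sketch(size_t nchars, size_t max_per_char)
{
    return nchars * max_per_char;
}
```

So a multiplier of 3 is only safe if every code point stays below 0x10000. Anything outside the base plane needs 4 bytes, and the 5- and 6-byte forms only matter if values above 0x1FFFFF are permitted.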
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
