Well, it's hard to grep for '3' in a big codebase. They should have used UTFmax
in the first place.

I see though that currently in libc.h I have

enum
{
        UTFmax          = 4,            /* maximum bytes per rune */
        Runesync        = 0x80,         /* cannot represent part of a UTF sequence (<) */
        Runeself        = 0x80,         /* rune and UTF sequences are the same (<) */
        Runeerror       = 0xFFFD,       /* decoding error in UTF */
        Runemax         = 0x10FFFF,     /* 21-bit rune */
        Runemask        = 0x1FFFFF,     /* bits used by runes (see grep) */
};

So Runemax seems to indicate that we never produce a rune using more than 3
bytes, no? So maybe buf[3] is safe?
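
One way to check the byte count is a small sketch like the one below. It
assumes standard UTF-8 encoding ranges; utf8len and checklens are hypothetical
names made up for illustration, not the libc runelen(), and surrogates are not
treated specially. Under that assumption 0xFFFF is the last code point that
fits in 3 bytes, and Runemax (0x10FFFF) lands in the 4-byte range, which would
match UTFmax = 4 and suggest buf[3] is too small for runes above 0xFFFF.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the libc runelen): number of bytes that
 * standard UTF-8 needs to encode code point c, or 0 if c is above
 * 0x10FFFF.
 */
int
utf8len(uint32_t c)
{
	if(c <= 0x7F)
		return 1;	/* 7 bits:  0xxxxxxx */
	if(c <= 0x7FF)
		return 2;	/* 11 bits: 110xxxxx 10xxxxxx */
	if(c <= 0xFFFF)
		return 3;	/* 16 bits: 1110xxxx 10xxxxxx 10xxxxxx */
	if(c <= 0x10FFFF)
		return 4;	/* 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
	return 0;		/* outside the Unicode range */
}

/* Sanity checks for the two boundaries in question. */
void
checklens(void)
{
	assert(utf8len(0xFFFF) == 3);	/* largest 16-bit code point */
	assert(utf8len(0x10FFFF) == 4);	/* value of Runemax above */
}
```

Calling checklens() from a test program should pass silently if the ranges
above are right.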

On Jun 18, 2014, at 10:36 AM, erik quanstrom <quans...@quanstro.net> wrote:

> On Wed Jun 18 13:36:09 EDT 2014, mirtchov...@gmail.com wrote:
>> used to be 3 :)
>> 
>> "UTFmax, defined as 3 in <libc.h>, is the maximum number of bytes
>> required to represent a rune."
> 
> which is exactly why this should have been caught.
> this one's my fault.
> 
> - erik
> 

