Larry Wall <[EMAIL PROTECTED]> writes:
> Russ Allbery writes:

>> Particularly since extending UTF-8 to more than 31 bits requires
>> breaking some of the guarantees that UTF-8 makes, unless I'm missing
>> how you're encoding the first byte so as not to give it a value of
>> 0xFE.

> The UTF-16 BOMs, 0xFEFF and 0xFFFE, both turn out to be illegal UTF-8 in
> any case, so it doesn't much matter, assuming BOMs are used on UTF-16
> that has to be auto-distinguished from UTF-8.  (Doing any kind of
> auto-recognition on 16-bit data without BOMs is problematic in any
> case.)

Yeah, but one of the guarantees of UTF-8 is:

   - The octet values FE and FF never appear.

I can see that this property may not be that important, but it makes me
feel like things that don't have this property aren't really UTF-8.

-- 
Russ Allbery ([EMAIL PROTECTED])             <http://www.eyrie.org/~eagle/>
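
To make the first-byte concern concrete, here is a rough Python sketch of
the up-to-six-octet UTF-8 bit pattern (31 bits of payload), naively
continued to a seventh octet.  The names lead_byte and encode, and the
seven-octet continuation itself, are assumptions made for illustration
only; this is not taken from any standard or from Perl's implementation.

```python
def lead_byte(n_bytes: int) -> int:
    """Lead-octet prefix for an n-octet sequence: n one-bits, then a zero bit.

    Payload bits are left as zero; the caller ORs them in.
    """
    if n_bytes == 1:
        return 0x00
    return (0xFF << (8 - n_bytes)) & 0xFF


def encode(cp: int) -> bytes:
    """Encode cp with the six-octet pattern, hypothetically extended to 7 octets."""
    if cp < 0x80:
        return bytes([cp])
    # An n-octet sequence carries (7 - n) bits in the lead octet
    # plus 6 bits in each of the (n - 1) continuation octets.
    for n in range(2, 8):
        payload_bits = (7 - n) + 6 * (n - 1)
        if cp < (1 << payload_bits):
            break
    else:
        raise ValueError("value does not fit in 7 octets (36 bits)")
    tail = []
    for _ in range(n - 1):
        tail.append(0x80 | (cp & 0x3F))  # continuation octet: 10xxxxxx
        cp >>= 6
    return bytes([lead_byte(n) | cp] + tail[::-1])


# Largest 31-bit value: six octets, lead octet 0xFD, no 0xFE or 0xFF anywhere.
six = encode(0x7FFFFFFF)
# First value needing 32 bits: the seventh octet forces a lead octet of 0xFE.
seven = encode(0x80000000)
```

With these definitions, encode(0x7FFFFFFF) starts with 0xFD, while
encode(0x80000000) starts with 0xFE, which is exactly the octet the
guarantee above says never appears.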