Russ Allbery writes:
: Yeah, but one of the guarantees of UTF-8 is:
: 
:    -  The octet values FE and FF never appear.
: 
: I can see that this property may not be that important, but it makes me
: feel like things that don't have this property aren't really UTF-8.

Which is one of the reasons I call it "utf8" instead.  I think of utf8
as a nice way to compactly store a sequence of arbitrarily sized
integers.  And you know I've never been particularly interested in
having Perl enforce arbitrary limits.  (Admittedly, in the particular
case of integer size, Perl has historically accepted some arbitrary
2**n limits to gain performance.)
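
To make the "sequence of arbitrarily sized integers" view concrete, here is a
minimal sketch in Python. It is not Perl's actual implementation; the helper
name encode_int is hypothetical. It assumes the usual UTF-8 lead/continuation
byte pattern and extends it by exactly one step, to a 7-byte form. Under that
assumption, integers below 2**31 never produce the octet FE, while larger ones
need it as a lead byte.

```python
def encode_int(n: int) -> bytes:
    """Encode a non-negative integer with the UTF-8 byte pattern.

    Hypothetical helper: follows the 1- to 6-byte lead/continuation
    layout and extends it one step to a 7-byte form (lead byte 0xFE).
    """
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    if n < 0x80:
        return bytes([n])  # single byte, 0xxxxxxx

    # Find the shortest length whose payload capacity fits n.
    for length in range(2, 8):
        lead_bits = 7 - length                   # payload bits in the lead byte
        capacity = lead_bits + 6 * (length - 1)  # 11, 16, 21, 26, 31, 36 bits
        if n < (1 << capacity):
            break
    else:
        raise ValueError("too large for this 7-byte sketch")

    # Continuation bytes carry 6 payload bits each (10xxxxxx), lowest first.
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (n & 0x3F))
        n >>= 6

    # Lead byte: `length` one-bits, a zero bit, then the remaining payload.
    lead_marker = (0xFF << (8 - length)) & 0xFF
    return bytes([lead_marker | n] + tail[::-1])


if __name__ == "__main__":
    for value in (0x41, 0x20AC, 0x7FFFFFFF, 0x80000000):
        print(hex(value), encode_int(value).hex(" "))
```

For values that are valid Unicode code points, the output matches standard
UTF-8. The FE octet only shows up once the integer no longer fits in 31 bits.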

I'm much more interested in the clean abstraction of "a string is a
sequence of integers" than I am in the fact that those integers happen
to represent particular characters under Unicode.  To be sure, it's
quite handy that those integers do represent characters, but (as has
been pointed out redundantly and repetitiously) the definition of
Unicode changes over time.

In contrast, the definition of integers doesn't change.  (At least, it
hadn't changed last time I checked...)

Larry
