And I do think people would rebel at using Latin-1 for that one.
I get enough grief for «...».  :-)

I can imagine that these cause some trouble for people who use an 8-bit charset other than ISO-8859-1 (Latin-1), such as the ones for Greek, Arabic, Cyrillic and Hebrew.

For these users Unicode is not so attractive, because it roughly doubles the
size of their files, so I would assume that they tend to do a lot of their work
in KOI8 or in some ISO-8859-x that does not contain the desired character.  For
«» it might not be such a problem, because <<>> would work instead.

Maybe this issue could (will?) be addressed by declaring the charset in the
source and using something like (or better than) \u00AB for characters that
this charset does not have, with the parser converting from the declared
charset to Unicode as it reads the source.  This looks somewhat cleaner to me
than just pretending a source file written in ISO-8859-7 (Greek) were
ISO-8859-1 (Latin-1), relying on the assumption that the two characters we use
above 0x80 happen to be in the same positions, 0xAB and 0xBB.

Sorry if that is an old story...

Best regards,

Karl


