Aaron,
I happen to agree with Dan about the unwieldiness of replacing
characters with their full names during character translation, but your
idea of using Unicode equivalents seems more palatable. I'm going to
ignore the issue of how this method of handling errors fits into the
scheme of possible error-handling methods, for the moment, because I
want to talk about that in a separate email. Having said that, I have a
few specific questions about some of your design choices. It's late and
I'm tired, so I'm probably a bit incoherent. If you have trouble
understanding me, let me know and I'll try to clarify. So, first:
How are you going to choose between different canonical compositions and
compatibility compositions when such a choice has to be made? For
example, when encoding combining characters, vertically oriented text,
or Korean jamo vs. syllables, how will you pick between the four
different normalization forms?
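To make the question concrete, here is a minimal sketch of how the four normalization forms (NFC, NFD, NFKC, NFKD) diverge on the three cases mentioned above. It uses Python's standard `unicodedata` module purely for illustration; the helper names `normalized_forms` and `code_points` and the sample strings are assumptions made for this example, not part of any proposed design.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def code_points(text):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

def normalized_forms(text):
    """Map each normalization form name to the code points it produces."""
    return {form: code_points(unicodedata.normalize(form, text)) for form in FORMS}

# Hypothetical samples, one per case raised in the question.
samples = {
    "combining character": "e\u0301",  # e + COMBINING ACUTE ACCENT
    "hangul syllable": "\uD55C",       # precomposed syllable, decomposes to jamo
    "vertical form": "\uFE35",         # PRESENTATION FORM FOR VERTICAL LEFT PARENTHESIS
}

for label, text in samples.items():
    print(label)
    for form, points in normalized_forms(text).items():
        print(f"  {form}: {points}")
```

The canonical forms (NFC, NFD) only compose or decompose, so they leave the vertical presentation form alone, whereas the compatibility forms (NFKC, NFKD) fold it to an ordinary parenthesis and lose the orientation. That loss is the choice being asked about.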
If a transparent conversion is required to get a string into Unicode
before transforming out of it, do we print the source character or its
Unicode equivalent if an error occurs?
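As a rough illustration of the two options, the sketch below converts through Unicode and, on failure, recovers both the original source bytes and the Unicode equivalent so that either could be reported. It is written in Python only as an example; the function name `describe_failure` and the choice of Latin-1 and ASCII as source and target encodings are assumptions, and it presumes the offending character round-trips back into the source encoding.

```python
import unicodedata

def describe_failure(data, source, target):
    """Return None if data converts cleanly, else details of the first bad character."""
    text = data.decode(source)      # step 1: source encoding -> Unicode
    try:
        text.encode(target)         # step 2: Unicode -> target encoding
    except UnicodeEncodeError as err:
        bad = text[err.start]       # first character the target cannot represent
        return {
            # Option A: report the character as it appeared in the source.
            "source_bytes": bad.encode(source),
            # Option B: report its Unicode equivalent.
            "unicode": f"U+{ord(bad):04X}",
            "name": unicodedata.name(bad, "<unnamed>"),
        }
    return None

print(describe_failure(b"caf\xe9", "latin-1", "ascii"))
```

Keeping both pieces of information around until the error is reported avoids having to pick one representation at conversion time.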
Michael