Paul,

No, I do not accept the premises you set out.
I will try, when I have more time, to make clear why with examples. Briefly, effective rules for encoding any 'character' recognized as a Unicode one as a 'longer' UTF-8 one do not in general exist. Moreover, even when such rules are available, my experience with them has been bad. In dealing recently with a document containing mixed English, German, Korean and Japanese text, I found that the UTF-8 version was 23% longer than the UTF-16 version.

John Gilmore, Ashland, MA 01721 - USA
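A size comparison of this kind is easy to repeat for any given text. The following is a minimal sketch in Python, not the measurement reported above: the sample strings are hypothetical stand-ins for the document's contents, and the helper names `encoded_sizes` and `percent_longer` are invented for illustration. UTF-16 is counted without a byte-order mark.

```python
# Hypothetical sample strings; they are NOT taken from the document
# discussed in the post, only illustrative text in the same four languages.
SAMPLES = {
    "English": "The quick brown fox",
    "German": "Größe und Maß",
    "Korean": "한국어 문서",
    "Japanese": "日本語の文書",
}


def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count without a BOM)."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


def percent_longer(text):
    """Percentage by which the UTF-8 form exceeds the UTF-16 form.

    A negative result means the UTF-8 form is the shorter one.
    """
    utf8_len, utf16_len = encoded_sizes(text)
    return 100.0 * (utf8_len - utf16_len) / utf16_len


if __name__ == "__main__":
    for name, text in SAMPLES.items():
        utf8_len, utf16_len = encoded_sizes(text)
        print(f"{name:9} UTF-8={utf8_len:3}  UTF-16={utf16_len:3}  "
              f"difference={percent_longer(text):+.0f}%")
```

Korean and Japanese characters in the Basic Multilingual Plane take three bytes each in UTF-8 and two in UTF-16, while ASCII characters take one and two respectively, so the overall figure for a mixed document depends on the proportions of each script.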
