On Mon, May 8, 2017 at 9:37 AM, Charles Mills <[email protected]> wrote:
> UTF-8 is great for datastreams but a PITA to deal with in a language or an
> application program.
>
> UTF-16 is the worst of both worlds -- uses roughly double the space of
> UTF-8 but still you can't quite deal with the characters as though they
> were fixed size. Worse, if you do pretend to deal with them as fixed size,
> it mostly works.
>
> What about a language concept where data was externalized as UTF-8 but
> presented to the program logic internally as UTF-32? With automatic,
> transparent re-encoding back-and-forth for externalization?
>

I think that Java does that sort of thing, except that it uses UTF-16
instead of UTF-32.

ref: https://docs.oracle.com/javase/7/docs/api/java/nio/charset/Charset.html
ref: https://docs.oracle.com/javase/7/docs/api/java/lang/Character.html

>
> Charles
>

--
Advertising is a valuable economic factor because it is the cheapest way of
selling goods, particularly if the goods are worthless. -- Sinclair Lewis

Maranatha! <><
John McKown
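
P.S. A minimal sketch of the Java behavior mentioned above, for illustration only. The class name Utf8RoundTrip and its helper methods are made up for this example; the library calls (String, StandardCharsets, Character) are standard Java 7. It decodes UTF-8 bytes into a String (UTF-16 code units internally), shows where treating chars as fixed-size characters goes wrong, and expands to UTF-32-style code points on request.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named for this example only.
final class Utf8RoundTrip {

    // External form: encode a String as UTF-8 bytes.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Internal form: decode UTF-8 bytes into a String (UTF-16 code units).
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Expand a String into one int per Unicode code point (a UTF-32 view).
    static int[] toCodePoints(String s) {
        int[] out = new int[s.codePointCount(0, s.length())];
        int i = 0;
        for (int pos = 0; pos < s.length(); ) {
            int cp = s.codePointAt(pos);
            out[i++] = cp;
            pos += Character.charCount(cp); // 1 or 2 UTF-16 units
        }
        return out;
    }

    public static void main(String[] args) {
        // 'a', U+00E9 (e with acute), U+1F600 (outside the BMP)
        String text = "a\u00E9\uD83D\uDE00";

        byte[] external = encode(text);
        String internal = decode(external);

        System.out.println("UTF-8 bytes:   " + external.length);
        System.out.println("UTF-16 units:  " + internal.length());
        System.out.println("code points:   " + toCodePoints(internal).length);
    }
}
```

For this three-character string, length() counts UTF-16 units, not characters, so the last character takes two units. That is the "mostly works" trap: code that assumes one char per character is correct until a character outside the Basic Multilingual Plane shows up.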
