The UTF war is mostly over and UTF-8 won. Just about every Web page you look at is delivered in UTF-8. UTF-8 supports all the same characters that UTF-32 supports, but for typical Web text the datastream is a little more than a quarter of the size of UTF-32. (Most of the transmitted characters are one byte, a few are two, a very few are three or four bytes. In UTF-32 they are all four bytes, of course.)
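FWIW the byte counts fall straight out of the code-point ranges. A minimal C sketch (mine, not anything from this thread; the ranges are just the standard UTF-8 boundaries) comparing the two encodings for a short sample string:

    /* Compare UTF-8 and UTF-32 storage for the same text.
       Byte counts are computed from the code points directly;
       no library support is assumed. */
    #include <stdio.h>
    #include <stdint.h>

    /* Bytes needed to encode one code point in UTF-8 */
    static int utf8_len(uint32_t cp) {
        if (cp < 0x80)    return 1;  /* ASCII range: one byte        */
        if (cp < 0x800)   return 2;  /* e.g. accented Latin, Greek   */
        if (cp < 0x10000) return 3;  /* rest of the BMP, incl. U+2019 */
        return 4;                    /* supplementary planes          */
    }

    int main(void) {
        /* "They're" with a curly apostrophe: six ASCII letters
           plus U+2019 RIGHT SINGLE QUOTATION MARK */
        uint32_t text[] = { 'T','h','e','y', 0x2019, 'r','e' };
        size_t n = sizeof text / sizeof text[0];
        size_t utf8 = 0;
        for (size_t i = 0; i < n; i++)
            utf8 += utf8_len(text[i]);
        printf("code points:  %zu\n", n);
        printf("UTF-8 bytes:  %zu\n", utf8);   /* 6*1 + 3 = 9  */
        printf("UTF-32 bytes: %zu\n", n * 4);  /* always 4 per code point: 28 */
        return 0;
    }

Nine bytes versus twenty-eight for mostly-ASCII text, which is where the "little more than a quarter" comes from.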
ISTR that Db2 is all UTF-8 internally. UTF-8 is everywhere. I guess that, depending upon how you look at it, it's either a feature or a bug that an ASCII-based application that is ignorant of UTF-8 nonetheless more-or-less works with it. I'd rather see an email with some garbage in it than not see the email at all, but YMMV.

Charles

-----Original Message-----
From: IBM Mainframe Assembler List <ASSEMBLER-LIST@LISTSERV.UGA.EDU> On Behalf Of Phil Smith III
Sent: Wednesday, August 27, 2025 6:13 AM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: UTF - WAS: Is HLASM efficient WAS: Telum and Spyre WAS: Vector instruction performance

Ok, yeah, UTF-32 is what I meant, brainfart. Was picturing fullwords when I wrote that!

But this:

>UTF-8 is critically important outside of the EBCDIC enclave since the
>first 128 characters are identical to US-ASCII-7. Compatibility with
>decades of code is critical.

...I don't quite get. I mean, yeah, the first chunk maps to ASCII, but that doesn't mean things can just say "I handle ASCII" if they then receive UTF-8. Unless you mean all those things that mostly kinda work but show crap for anything > 128? E.g., "They’re" for "They're"? (Eudora, I'm looking at you! Not your fault, you died before UTF-8 was common.)
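For anyone curious what's mechanically going on in Phil's Eudora example: UTF-8 encodes U+2019 (the curly apostrophe in "They're") as the three bytes X'E2 80 99', and in Windows-1252 those three byte values are the characters â, €, and ™. A rough C sketch of mine, not anything from the thread; it assumes a Windows-1252-minded viewer, and the lookup covers only the three bytes this example needs:

    /* Show how a CP1252-assuming viewer mangles UTF-8 text */
    #include <stdio.h>

    int main(void) {
        /* "They're" with a curly apostrophe, written out as UTF-8 bytes */
        const unsigned char utf8[] = { 'T','h','e','y',
                                       0xE2, 0x80, 0x99,  /* UTF-8 for U+2019 */
                                       'r','e', 0 };

        /* Render each byte the way a Windows-1252 viewer would */
        for (const unsigned char *p = utf8; *p; p++) {
            switch (*p) {
            case 0xE2: fputs("\u00E2", stdout); break; /* a-circumflex in CP1252 */
            case 0x80: fputs("\u20AC", stdout); break; /* euro sign in CP1252    */
            case 0x99: fputs("\u2122", stdout); break; /* trade mark in CP1252   */
            default:   putchar(*p);             break; /* ASCII passes through   */
            }
        }
        putchar('\n');  /* prints: They’re */
        return 0;
    }

Note that the six ASCII letters come through untouched, which is exactly the "mostly kinda work" behavior: only the bytes above 127 turn to crap.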