The UTF war is mostly over and UTF-8 won.

Just about every Web page you look at is delivered in UTF-8. UTF-8 supports all 
the same characters that UTF-32 supports, but the data stream is a little more 
than a quarter of the size of UTF-32. (Most of the transmitted characters are 
one byte, a few are two, and a very few are three or four bytes. In UTF-32 they 
are all four bytes, of course.)
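A quick sketch of that size difference (my illustration, not from the original post): compare the per-character byte counts of a few sample characters under UTF-8 and UTF-32.

```python
# Byte counts per character: UTF-8 varies from 1 to 4 bytes,
# UTF-32 is a fixed 4 bytes for every character.
for ch in ["A", "\u00e9", "\u20ac", "\U0001f600"]:  # A, é, €, emoji
    print(repr(ch),
          len(ch.encode("utf-8")),      # 1, 2, 3, 4 bytes respectively
          len(ch.encode("utf-32-be")))  # always 4 bytes
```

For mostly-ASCII text (like most Web pages), nearly every character is one byte in UTF-8, hence the roughly 4:1 ratio.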

ISTR that Db2 is all UTF-8 internally. UTF-8 is everywhere.

I guess that depending upon how you look at it, it's either a feature or a bug 
that an ASCII-based application that is ignorant of UTF-8 nonetheless 
more-or-less works with it. I'd rather see an email with some garbage in it 
than not see the email at all, but YMMV.
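The "some garbage" effect is easy to demonstrate (a hypothetical sketch, not from the original messages): an application that assumes a legacy single-byte code page shows mojibake for anything above 127, but the ASCII portion of the text survives intact.

```python
# UTF-8 text containing a curly apostrophe (U+2019, three bytes in UTF-8).
utf8_bytes = "They\u2019re".encode("utf-8")

# An application that wrongly assumes Windows-1252 still shows the ASCII
# letters correctly; only the multi-byte character turns to garbage.
print(utf8_bytes.decode("cp1252"))  # -> Theyâ€™re
```

The email is still readable, which is exactly the feature-or-bug trade-off: graceful degradation rather than outright failure.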

Charles

-----Original Message-----
From: IBM Mainframe Assembler List <ASSEMBLER-LIST@LISTSERV.UGA.EDU> On Behalf 
Of Phil Smith III
Sent: Wednesday, August 27, 2025 6:13 AM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: UTF - WAS: Is HLASM efficient WAS: Telum and Spyre WAS: Vector 
instruction performance

Ok, yeah, UTF-32 is what I meant, brainfart. Was picturing fullwords when I 
wrote that!

But this:
>UTF-8 is critically important outside of the EBCDIC enclave since the 
>first 128 characters are identical to US-ASCII-7. Compatibility with 
>decades of code is critical.

...I don't quite get. I mean, yeah, the first chunk maps to ASCII, but that 
doesn't mean things can just say "I handle ASCII" if they then receive UTF-8. 
Unless you mean all those things that mostly kinda work but show crap for 
anything > 128? E.g., They’re for They're? (Eudora, I'm looking at you! Not 
your fault, you died before UTF-8 was common.)
