UTF-8 is just an encoding of Unicode, not a character set. All of ISO-8859-1 is part of Unicode.
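To make the distinction concrete, here is a minimal Python sketch (not taken from anyone's build; the sample character and the helper name `utf8_len` are my own choices for illustration). It checks that every ISO-8859-1 byte value corresponds to the Unicode code point with the same number, and shows the same code point producing different octets under the two encodings.

```python
# Illustrative sketch only; the sample character is an arbitrary pick.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, U+00E9

# Same character (code point), two different encodings.
latin1_bytes = ch.encode("iso-8859-1")  # b'\xe9'      (one octet)
utf8_bytes = ch.encode("utf-8")         # b'\xc3\xa9'  (two octets)

# Every ISO-8859-1 byte value 0x00-0xFF decodes to the Unicode
# code point with the same numeric value.
all_match = all(
    bytes([b]).decode("iso-8859-1") == chr(b) for b in range(256)
)


def utf8_len(code_point: int) -> int:
    """Hypothetical helper: number of octets UTF-8 uses for one code point."""
    return len(chr(code_point).encode("utf-8"))


print(latin1_bytes, utf8_bytes, all_match)
print(utf8_len(0x7F), utf8_len(0x80), utf8_len(0xFF))
```

So the character set is shared; only the byte representation differs, and it differs exactly for the code points above U+007F.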
Of course, the encoding of characters between U+0080 and U+00FF requires two octets in UTF-8. And, yes, UTF-8 is clearly the way forward, although there may be some bumps in the road.

--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

________________________________________
From: IBM Mainframe Discussion List [IBM-MAIN@LISTSERV.UA.EDU] on behalf of Andrew Rowley [and...@blackhillsoftware.com]
Sent: Wednesday, July 12, 2023 9:02 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Python 3.11 on z/OS - UTF-8 errors

On 13/07/2023 10:01 am, David Crayford wrote:
> We specify
> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> in
> our Maven builds as most of the time we are building off host on
> machines with UTF8 locales. However, we tag our files ISO8859-1 on z/OS
...
> If we cared about the euro sign we could change it to ISO8859-15 which
> is still an 8-bit character set. It’s those pesky codes above 0x7F in
> UTF-8 that cause the issues.

Euro was just an example, there are plenty of other UTF-8 characters.

If you convert to an 8 bit character set, does it mean that any literals with codes above 0x7F are silently broken? Or does git fail to checkout?

Either way, sourceEncoding=UTF8 seems like a good answer to why you might want to actually have the files encoded in UTF8. Anything else would seem to be courting unpredictable errors.

--
Andrew Rowley
Black Hill Software

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN