On Tue, 7 Feb 2012 09:42:05 -0600, Paul Gilmartin <[email protected]> wrote:

>And, speaking of standards, this is a conspicuous violation by z/OS.
>You know CMS.  CMS Pipelines correctly translates:
>
>    IBM-1047    ISO8859-1
>
>    NL  0x15    NL  0x85
>    LF  0x25    LF  0x0a
>
>iconv(1) on Ubuntu Linux correctly does likewise.  (What do the
>various Linuxen for z do?)
>
>iconv(1) on z/OS does:
>
>    IBM-1047    ISO8859-1
>
>    NL  0x15    LF  0x0a
>    LF  0x25    NL  0x85
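To make the two quoted mappings concrete, here is a minimal Python sketch. It models only the NL and LF code points listed above; the names STANDARD, ZOS, translate and invert are made up for illustration and are not part of iconv or CMS Pipelines. It converts with the z/OS-style table and then converts back with the inverse of the other table, as a different platform would.

```python
# Hypothetical two-entry tables: only the NL and LF code points quoted above
# are modelled; every other byte is passed through unchanged.
NL_1047, LF_1047 = 0x15, 0x25   # IBM-1047 code points
NL_8859, LF_8859 = 0x85, 0x0A   # ISO8859-1 code points

# Mapping as quoted for CMS Pipelines and iconv(1) on Ubuntu Linux.
STANDARD = {NL_1047: NL_8859, LF_1047: LF_8859}
# Mapping as quoted for iconv(1) on z/OS.
ZOS = {NL_1047: LF_8859, LF_1047: NL_8859}


def translate(data: bytes, table: dict) -> bytes:
    """Map each byte through table, leaving unlisted bytes unchanged."""
    return bytes(table.get(b, b) for b in data)


def invert(table: dict) -> dict:
    """Build the reverse mapping for a one-to-one table."""
    return {v: k for k, v in table.items()}


original = bytes([NL_1047, LF_1047])
on_zos = translate(original, ZOS)                     # converted with the z/OS table
back_elsewhere = translate(on_zos, invert(STANDARD))  # converted back elsewhere

print(original.hex(" "), "->", on_zos.hex(" "), "->", back_elsewhere.hex(" "))
# prints: 15 25 -> 0a 85 -> 25 15
```

With these assumed tables, NL and LF come back swapped whenever the two directions of the conversion use different tables. A round trip that uses the same table in both directions returns the original bytes.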
The IBM Globalization Center of Competence possesses the One Ring that rules all IBM code pages.  They provide the code pages and the translations.  The GCoC says EBCDIC 0x15 in cp1047 should be translated to 0x85 in cp819 (8859-1), as you say.  (The translation table 10470819 on z/VM gives you the GCoC-defined translation.)

In the traditional System z world, we store text files as records.  In that world, the 0x15 has significance only in device drivers, and then typically only to SCS printers.  z/VM keeps the history alive, but I digress....

The rub is, of course, POSIX.  When in a POSIX frame of mind, the 0x15 again has significance, being the IBM-chosen value for the <newline> required by the POSIX standard.  Now the POSIX translation rules apply.  It's an inherent fugue state.

IMO, if you use iconv in z/OS outside of USS and explicitly tell it IBM-1047 and IBM-819, it should convert as you describe, since to do otherwise destroys the ability of other platforms to reliably translate the file back to code page 1047.

>I have a similar problem with the misbehavior of LC_COLLATE=En_US
>in z/OS LE.  IBM is trying to tell me it's an ASCII vs. EBCDIC problem.
>From a POSIX perspective, collation order should be the same on all platforms.
> The characters appear in a defined order, without regard to the
>platform-specific code point assigned to the characters.  For example, both
>En_US and the POSIX locales sort numbers, then upper case, then lower case.
>This is consistent with the byte sort order in ASCII.  They have different
>sort orders for the control characters, and POSIX only deals with 7 bits.

Alan Altmark
IBM
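On the collation point quoted above, a small Python sketch shows why raw byte order cannot give the same result on both sides. It is an illustration under assumptions, not what z/OS LE does: it compares plain byte order only (no locale tables are involved), and it uses Python's cp037 codec as a stand-in for IBM-1047 because, as far as I know, the standard library does not ship a cp1047 codec. The Latin letters and digits sit at the same code points in both EBCDIC code pages.

```python
# Compare plain byte-order sorting under ASCII and under an EBCDIC code page.
# Assumption: cp037 stands in for IBM-1047; letters and digits have the same
# code points in both.
words = ["a", "A", "1"]

ascii_order = sorted(words, key=lambda s: s.encode("ascii"))
ebcdic_order = sorted(words, key=lambda s: s.encode("cp037"))

print("ASCII byte order: ", ascii_order)    # ['1', 'A', 'a']
print("EBCDIC byte order:", ebcdic_order)   # ['a', 'A', '1']
```

ASCII byte order gives digits, then upper case, then lower case, which matches the order described in the quoted text. EBCDIC byte order gives the reverse, so a locale that is meant to collate identically on every platform has to define the order by character, not by code point.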

