Every release that I have ever used has had issues with translation, ever since the original ASCII support for tape.
Do you mean UTF-8 or UTF-EBCDIC (https://en.wikipedia.org/wiki/UTF-EBCDIC)?

--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

________________________________________
From: IBM Mainframe Discussion List <IBM-MAIN@LISTSERV.UA.EDU> on behalf of Robert Prins <robert.ah.pr...@gmail.com>
Sent: Saturday, February 9, 2019 7:30 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: XMIT Manager and CP1047 (or rather CP1046.9921875)

Over the last week or so I've been having a discussion with Denis Molony about his XmitApp, a platform-agnostic, it's written in Java, viewer for XMIT files, see <https://github.com/dmolony/Xmit>

As is, it (currently) only shows the contents of the xmit file in one of the panels, and he's hit a snag. One of my PDS's he's using contains text that comes from uploaded-to-z/OS UTF-8 encoded text (which basically means all UTF-8 characters are mangled beyond recognition on z/OS).
It's processed on z/OS, and the results, also containing UTF-8 encoded text, are downloaded to Windoze (unmangling the mangled mess again), but XmitApp using CP1047 screws up code points 0x15 and 0x25, and if you take a look at those two code points on <https://en.wikipedia.org/wiki/EBCDIC_1047>, shiver...

0x15 is NL (Newline) (Unicode 0085)
0x25 is LF (Linefeed) (Unicode 000A)

I never realised that EBCDIC had two of the same, just different...

Extract the files with Neil Johnston-Ward's XMIT Manager from the cbttape.org site @ <http://www.cbttape.org/njw/index.html>, and the UTF-8 encoded characters show up OK. Do it with the official CP1047 and they don't.

So load XMIT Manager.exe into a hex-editor, I'm (still) using HxD 1.7.7.0 from <https://mh-nexus.de/en/hxd/>, and look for the translate table NJW uses (just do a find for 'abcdef') and you'll see that he has swapped the ASCII characters for the 0x15 and 0x25 code points from those in the "official" CP1047...

Denis has found an APAR dating back to 2010, <https://www-01.ibm.com/support/docview.wss?uid=swg1IZ70874>, that seems to confirm that, for Java in mixed environments, i.e. z/OS vs little white boxes, NJW is correct in swapping them.

Can anyone provide any more insights?

For what it's worth, I'm currently restricted to doing the round-trip transfers using IND$FILE (Upload as ASCII, download of XMIT (obviously) binary), but I would appreciate if anyone can check what happens if they are done using FTP or the WSA.

I've attached, in the hope it survives, utf-8.zip.txt with a bit of UTF-8 encoded data to experiment with.
It's all the UTF-8 encoded data that's in use in the test file, and consists of European (and a few Japanese) place names.

Robert
--
Robert AH Prins
robert.ah.prins(a)gmail.com

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
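
For anyone who wants to experiment with the 0x15/0x25 swap described in the message without opening a hex editor, here is a minimal Python sketch. It is an illustration only and not the code of XMIT Manager or XmitApp. The two-entry table models just the two disputed code points as they are listed in the message, and the names OFFICIAL, SWAPPED, swap_15_25 and decode are made up for this example.

```python
# The two EBCDIC code points under discussion.
NL, LF = 0x15, 0x25

# Mapping for these two code points as quoted in the message for CP1047.
# A real translate table has 256 entries; only these two are modelled here.
OFFICIAL = {NL: "\u0085", LF: "\u000a"}

# The same table with the two entries exchanged (the swapped variant).
SWAPPED = {NL: OFFICIAL[LF], LF: OFFICIAL[NL]}

# Byte-level equivalent: exchange 0x15 and 0x25 in the raw data, so that an
# unmodified table can be applied afterwards. All other bytes pass through.
_SWAP_TABLE = bytes.maketrans(bytes([NL, LF]), bytes([LF, NL]))


def swap_15_25(data: bytes) -> bytes:
    """Return a copy of data with every 0x15 and 0x25 byte exchanged."""
    return data.translate(_SWAP_TABLE)


def decode(data: bytes, table: dict) -> str:
    """Decode bytes using table; bytes not in the table become U+FFFD."""
    return "".join(table.get(b, "\ufffd") for b in data)


if __name__ == "__main__":
    sample = bytes([NL, LF])
    print([hex(ord(c)) for c in decode(sample, OFFICIAL)])  # ['0x85', '0xa']
    print([hex(ord(c)) for c in decode(sample, SWAPPED)])   # ['0xa', '0x85']
```

Swapping the bytes first and then decoding with the unmodified table gives the same result as decoding with the swapped table, so swap_15_25 could be put in front of any existing CP1047 decoder to compare the two behaviours on the same input.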