On Wednesday, March 07, 2007 2:34 PM, Ron Davies wrote:
>When I do this I get a number of error messages such as :
>"\x{00ce}" does not map to utf8 at myprogram.pl line xxx.
>and in the output file instead of the correct character there is a hex
>encoding. This happens with Greek but also perfectly ordinary Latin
>characters.
I can't offer any advice, but I am experiencing what may be similar difficulties.

I finally had a chance to get MARC::Charset and MARC::File::XML installed and working, so I could try out xml2marc and marc2xml. After creating a test record containing a field with diacritics, I ran marc2xml followed by xml2marc, hoping to end up with a record matching the original.

marc2xml appears to have translated the raw MARC into MARCXML successfully. It left the leader's record length unchanged, though it did set byte 9 to 'a' for Unicode.

Unfortunately, running xml2marc on any of the .xml files I have produces an empty file. In some cases I get a message:

"Cannot decode string with wide characters at C:/Perl/lib/Encode.pm line 184, <GEN1> line 1."

In other cases I get no error message, but the output file is still empty.

I have tried a number of variations in the starting file:
marc8.mrc->utf8.xml
utf8.mrc->utf8.xml
MarcEdit-produced .xml->Perl-produced .mrc

My system:
Windows XP
ActivePerl v5.8.2 built for MSWin32-x86-multi-thread (Binary build 808)
MARC::Record: 2.0
Encode: 1.9801

Are these problems related to the age of my Perl or Encode? (If I remember correctly, before I switched to MARC::Record 2.0, using MARC::Record 1.39_1 with xml2marc did output records, but the field containing diacritics was mangled, deleted, or replaced with bad data.)

Thank you for your assistance,

Bryan Baldus
[EMAIL PROTECTED]
http://home.inwave.com/eija
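
For reference, here is a minimal sketch of one general way the "Cannot decode string with wide characters" message can arise in Perl: calling Encode's decode on a string that has already been decoded into characters. It uses only the core Encode module and a made-up two-character sample; it is an assumption that something like this is involved, not a diagnosis of what xml2marc or MARC::File::XML actually does, and the exact behavior may differ on an older Encode such as 1.9801.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: UTF-8 bytes for U+00E9 (e with acute) followed by
# U+03B1 (Greek alpha), as they might be read raw from a file.
my $octets = "\xC3\xA9\xCE\xB1";

# First decode: four bytes become a two-character Perl string.
my $chars = decode('UTF-8', $octets);
printf "decoded length: %d\n", length $chars;

# Second decode of the already-decoded string: it now holds a code point
# above 0xFF, so it cannot be treated as a byte sequence and decode dies.
my $again = eval { decode('UTF-8', $chars) };
print "second decode failed: $@" unless defined $again;

# Encoding goes the other way and recovers the original bytes.
my $bytes = encode('UTF-8', $chars);
print "round trip ok\n" if $bytes eq $octets;
```

If the same data passes through two layers that each try to decode it (for example, a file handle layer and then a library call), an error of this kind would be one plausible result.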