So I got curious (thanks to your convo in #code4lib). I isolated the problem to one record:

        http://www.inkdroid.org/tmp/one.dat

Your roundtrip conversion complains:

--

no mapping found at position 8 in Price : <9c> 7.99; Inv.# B 476913; Date 06/03/98; Supplier : Dawson UK; Recd 20/03/98; Contents : 1. The problem : 1. Don't bargain over positions; 2. The method : 2. Separate the people from the problem; 3. Focus on interests, not positions; 4. Invent options for mutual gain; 5. Insist on using objective criteria; 3. Yes, but : 6. What if they are more powerful? 7. What if they won't play? 8. What if they use dirty tricks? 4. In conclusion; 5. Ten questions people ask about getting to yes; g0=ASCII_DEFAULT g1=EXTENDED_LATIN at /usr/local/lib/perl5/site_perl/5.8.7/MARC/Charset.pm line 126.

--

So I took a look at the MARC record and found a 0x9C byte at that position, just as the error message indicates. I can't find 0x9C in either of the mapping tables that this record purports to use:
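If you want to hunt for the byte yourself, here is a rough sketch in Python (not the Perl toolchain we've been using, just something quick and self-contained). The function name find_byte and its parameters are made up for illustration. Note that the offsets are relative to whatever bytes you pass in, whereas the error message counts from the start of the field data.

```python
def find_byte(data: bytes, target: int = 0x9C, context: int = 10):
    """Return (offset, surrounding bytes) for each occurrence of target.

    Hypothetical helper: 'context' is how many bytes to show on
    either side of each hit.
    """
    hits = []
    for i, b in enumerate(data):
        if b == target:
            hits.append((i, data[max(0, i - context): i + context + 1]))
    return hits


if __name__ == "__main__":
    # Sample bytes taken from the start of the field in the error message.
    sample = b"Price : \x9c 7.99"
    for offset, around in find_byte(sample):
        print(offset, around)
```

To check a whole file you could read it with open(path, "rb").read() and pass the result in.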

BasicLatin (ASCII): http://lcweb2.loc.gov/cocoon/codetables/42.html
Extended Latin (ANSEL): http://lcweb2.loc.gov/cocoon/codetables/45.html

Looks like you might want to preprocess those records before translating. Since this character routinely occurs in the 586 field, you could use MARC::Record to remove the offending character before writing the record out as XML.
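For what it's worth, here is a minimal sketch of that kind of preprocessing. It is plain Python working on the raw ISO 2709 bytes rather than MARC::Record, so treat it as an illustration of the idea and not as the MARC::Record way of doing it. The name strip_byte and its parameters are my own invention. It assumes a single, well-formed MARC-8 record, and it rebuilds the directory and leader lengths, since simply deleting a byte from the raw record would throw the offsets off.

```python
FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator


def strip_byte(record: bytes, bad: int = 0x9C, tags=None) -> bytes:
    """Remove every 'bad' byte from the fields of one raw MARC record.

    Hypothetical helper. If 'tags' is given (e.g. {b"586"}), only
    those fields are cleaned; otherwise all fields are.
    """
    leader = record[:24]
    base = int(leader[12:17])          # base address of data
    directory = record[24:base - 1]    # drop the directory's terminator

    # Pull out each field via its 12-byte directory entry.
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3]
        length = int(entry[3:7])
        start = int(entry[7:12])
        data = record[base + start: base + start + length]
        if tags is None or tag in tags:
            data = data.replace(bytes([bad]), b"")
        fields.append((tag, data))

    # Rebuild the directory with the new lengths and offsets.
    new_dir = b""
    body = b""
    for tag, data in fields:
        new_dir += tag + b"%04d%05d" % (len(data), len(body))
        body += data

    new_base = 24 + len(new_dir) + 1
    total = new_base + len(body) + 1
    new_leader = (b"%05d" % total + leader[5:12]
                  + b"%05d" % new_base + leader[17:])
    return new_leader + new_dir + FT + body + RT
```

Calling strip_byte(raw, tags={b"586"}) would limit the cleanup to the 586 field. Blindly removing a byte like this is only reasonable for MARC-8 data; in a UTF-8 record 0x9C can be part of a legitimate multibyte sequence.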

Hope that helps somewhat. This character conversion stuff is a major pain.

//Ed
