Thanks everyone for the help thus far. Ed and I have been chatting
on code4lib ... it seems there are two problems. One is with
the 9C character, which I now have a workaround for. I added the
following to Charset.pm line 151:
    if ($marc8 =~ /\x{9C}/) {
        $utf8 .= ' ';
> So I took a look at that position in the marc record and
> found a 0x9C character at that position, as the error
> message indicates. I can't find a 0x9C in either of the
> mapping tables that this record purports to use:
0x9C is a C1 control character that is generally assigned the function
of
Hi Ed,
Interesting ... when I run marcdump I get:
 Recs  Errs Filename
----- ----- --------
  192     0 sample.mrc
Here's the file posted on a web server (maybe a problem with
the list truncating the attachment?):
http://liblime.com/public/sample.mrc
Could you try downloading from there and running marcdump again?
Edward Summers wrote:
On May 18, 2006, at 6:48 AM, Joshua Ferraro wrote:
> Anyway, if anyone can shed some light on this I'd be grateful.
I believe the data loss you are seeing is due to your source
records, not to character translation.
Just a quick look but I think in many cases the
So I got curious (thanks to your convo in #code4lib). I isolated the
problem to one record:
http://www.inkdroid.org/tmp/one.dat
Your roundtrip conversion complains:
--
no mapping found at position 8 in Price : <9c> 7.99;Inv.# B
476913;Date 06/03/98; Supplier : Dawson UK;
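
For anyone who wants to check their own data for the same byte, here is a minimal sketch (not part of MARC::Charset) that reports where 0x9C occurs in a string of bytes. The helper name find_9c_offsets and the sample field are assumptions made up for illustration; the sample is modelled on the error message above, and real field data may be laid out differently.

```perl
use strict;
use warnings;

# Hypothetical helper: return the zero-based offsets of every 0x9C
# byte found in the given byte string.
sub find_9c_offsets {
    my ($bytes) = @_;
    my @offsets;
    while ($bytes =~ /\x9C/g) {
        # pos() points just past the match, so step back one byte.
        push @offsets, pos($bytes) - 1;
    }
    return @offsets;
}

# Sample field modelled on the error message; the exact bytes are assumed.
my $field = "Price : \x9C 7.99;Inv.# B 476913;";
my @hits  = find_9c_offsets($field);
print "0x9C at offset(s): @hits\n";
```

With the sample string as written, the byte sits at offset 8, which lines up with the position reported in the warning.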
On May 18, 2006, at 10:03 AM, Joshua Ferraro wrote:
> http://liblime.com/public/sample.mrc
> Could you try downloading from there and running marcdump again?
Yes, that one has the same number of records but now passes through
marcdump fine. Now, when running your script I get a lot of warnings
On May 18, 2006, at 6:48 AM, Joshua Ferraro wrote:
> Anyway, if anyone can shed some light on this I'd be grateful.
I believe the data loss you are seeing is due to your source records,
not to character translation. Just running marcdump on them
generates a ton of errors (see below).