On 3/28/07, Jackie Shieh <[EMAIL PROTECTED]> wrote:
Mike,
Attached is the questionable marc record.
The error message came from the command line.
% marcdump 50987256.mrc
I believe we have upgraded to the most recent version
(v.2) from CPAN. What is the current MARC::Charset module
we should have?
The newest version on CPAN is 0.96, released just a couple of weeks ago.
http://search.cpan.org/~mikery/MARC-Charset-0.96/lib/MARC/Charset.pm
After sending my query, I kept coming across more
records with the same "does not map to Unicode" error.
Thanks for your help!
That record is /definitely/ UTF-8 encoded, which means there's no need
to use MARC::Charset for it. If there is a mix of UTF-8 and
MARC8 encoded records, you can add
MARC::Charset->ignore_errors(1);
MARC::Charset->assume_encoding('UTF-8');
to the top of your script to fall back to UTF-8 if an encoding error
is encountered. This, of course, assumes that the non-MARC8 encoding
actually is UTF-8.
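If it is unclear which records in a batch are UTF-8 and which are MARC8, a rough check along the following lines may help before deciding on the fallback. This is only a sketch: guess_marc_encoding is a hypothetical helper name, not part of MARC::Charset or MARC::Record. It relies on two things: leader position 9 is 'a' for UCS/Unicode records and blank for MARC-8, and Perl's core Encode module can do a strict UTF-8 decode that dies on malformed bytes.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper (not part of any MARC module): report what one raw
# MARC record claims about its encoding and whether its bytes are
# well-formed UTF-8.
sub guess_marc_encoding {
    my ($raw) = @_;

    # Leader position 9 (zero-based): 'a' = UCS/Unicode, blank = MARC-8.
    my $flag = substr($raw, 9, 1);

    # Strict decode; FB_CROAK makes decode() die on invalid UTF-8 bytes.
    my $valid_utf8 = eval { decode('UTF-8', $raw, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;

    return { leader_flag => $flag, valid_utf8 => $valid_utf8 };
}
```

To run it over a file, one option is to open the file in raw mode, set $/ to "\x1D" (the MARC record terminator) and pass each record to the helper. Keep in mind that a pure-ASCII MARC-8 record is also valid UTF-8, so valid_utf8 being 1 does not by itself prove the record is UTF-8, and the leader flag is only as reliable as whoever wrote the record.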
Let us know if that helps!
--Jackie
On Wed, 28 Mar 2007, Mike Rylander wrote:
> On 3/28/07, Jackie Shieh <[EMAIL PROTECTED]> wrote:
>>
>> I am looking at a set of 7000+ records. The 514th is a record
>> that contains a transliteration of Amharic (Ethiopian) in the
>> corporate body. MARC::Record does not seem to have a mapping for
>> it. See attached screen shot.
>>
>> utf8 "\xAE" does not map to Unicode at
>> /usr/lib/perl5/5.8.6/i386-linux-thread-multi/Encode.pm line 166.
>>
>> Have you come across something like this? How did you get
>> around it?! Thanks for your help!
>
> Looking at the code map for MARC8, it seems this record is in fact
> MARC8 encoded. We need to confirm what code of yours is using Perl's
> Encode module, but my guess is that it's a very old MARC::Charset
> module. Can you show us a simplified example script that exhibits
> this behavior?
>
> TIA
>
> --
> Mike Rylander
> [EMAIL PROTECTED]
> GPLS -- PINES Development
> Database Developer
> http://open-ils.org
>
--
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org