On 3/28/07, Jackie Shieh <[EMAIL PROTECTED]> wrote:
Mike,

Attached is the questionable MARC record. The error message came from the command line:

  % marcdump 50987256.mrc

I believe we have upgraded to the most recent version (v.2) from CPAN. What is the current MARC::Charset module we should have?
The newest version on CPAN is 0.96, released just a couple of weeks ago:

  http://search.cpan.org/~mikery/MARC-Charset-0.96/lib/MARC/Charset.pm
After sending my query, I kept coming across more records with the same Charset "does not map" issue. Thanks for your help!
That record is /definitely/ UTF-8 encoded, which means there's no need to use MARC::Charset for it. If there is a mix of records that are UTF-8 and MARC8 encoded, you can add

  MARC::Charset->ignore_errors(1);
  MARC::Charset->assume_encoding('UTF-8');

to the top of your script to fall back to UTF-8 if an encoding error is encountered. This, of course, assumes that the non-MARC8 encoding actually is UTF-8.

Let us know if that helps!
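Before relying on a UTF-8 fallback, it can help to check which records in a mixed file really are UTF-8. The following is a minimal sketch using only Perl's core Encode module. The helper name guess_marc_encoding is hypothetical (it is not part of MARC::Charset or MARC::Record), and it assumes you already have each record's raw bytes in a scalar. It reports what the leader declares (position 09 is 'a' for UCS/Unicode, blank for MARC-8) and whether the bytes survive a strict UTF-8 decode.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of MARC::Charset or MARC::Record:
# inspect one raw MARC record (a byte string) and report encoding hints.
sub guess_marc_encoding {
    my ($raw) = @_;

    # Leader position 09 is the character coding scheme:
    # 'a' means UCS/Unicode, a blank means MARC-8.
    my $declared = substr($raw, 9, 1) eq 'a' ? 'UTF-8' : 'MARC-8';

    # Strict decode of a copy; FB_CROAK makes Encode die on malformed
    # input, so the eval tells us whether the bytes are well-formed UTF-8.
    my $copy = $raw;
    my $valid_utf8 =
        eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK()); 1 } ? 1 : 0;

    # A pure-ASCII record is valid under both encodings, so note
    # whether any high bytes are present at all.
    my $has_high_bytes = $raw =~ /[\x80-\xFF]/ ? 1 : 0;

    return {
        declared       => $declared,
        valid_utf8     => $valid_utf8,
        has_high_bytes => $has_high_bytes,
    };
}
```

A record that has high bytes, passes the strict decode, and declares 'a' in the leader is very likely UTF-8 and needs no MARC::Charset conversion. A record that fails the decode is a candidate for MARC8 conversion. Records where the leader and the bytes disagree are worth inspecting by hand.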
--Jackie

On Wed, 28 Mar 2007, Mike Rylander wrote:
> On 3/28/07, Jackie Shieh <[EMAIL PROTECTED]> wrote:
>>
>> I am looking at a set of 7000+ records-- 514th rec is a record
>> that contains transliteration for Amharic (Ethiopian) on corporate
>> body. MARC::Record does not have a map to it. See attached
>> screen shot.
>>
>> utf8 "\xAE" does not map to Unicode at
>> /usr/lib/perl5/5.8.6/i386-linux-thread-multi/Encode.pm line 166.
>>
>> Have you come across something like this? How did you get
>> around it?! Thanks for your help!
>
> Looking at the code map for MARC8, it seems this record is in fact
> MARC8 encoded. We need to confirm what code of yours is using Perl's
> Encode module, but my guess is that it's a very old MARC::Charset
> module. Can you show us a simplified example script that exhibits
> this behavior?
>
> TIA
>
> --
> Mike Rylander
> [EMAIL PROTECTED]
> GPLS -- PINES Development
> Database Developer
> http://open-ils.org
>
--
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org