RE: [Net-z3950] Question about using MARC::Charset

Christopher Morgan Sat, 28 Jun 2003 09:40:08 -0700

Myron,

You're the first to reply -- and thanks, I'll give this a try! (I'm
copying this to [EMAIL PROTECTED])


-- Chris

-----Original Message-----
From: Myron Turner [mailto:[EMAIL PROTECTED] 
Sent: Saturday, June 28, 2003 9:50 AM
To: Christopher Morgan
Subject: Re: [Net-z3950] Question about using MARC::Charset

Hi Chris,

I don't know if anyone responded to you as yet.  But your guess is
correct--the character count is changed when converting between MARC-8
and utf8.   Doing it is simple.   You create a MARC::Charset object:

    my $cs = MARC::Charset->new();

and then pass a string of MARC-8 characters to

    $cs->to_utf8()

In my AsyncZ.pm module I have the following function, which makes the 
conversion:

    # Convert every string in $results[$index] from MARC-8 to UTF-8.
    sub _utf8 {
        my $index = shift;
        my $cs = MARC::Charset->new();
        for (my $i = 0; $i < scalar(@{$results[$index]}); $i++) {
            $results[$index]->[$i] = $cs->to_utf8($results[$index]->[$i]);
        }
    }


$results[$index]->[$i] is a string from the raw file, which is converted
in place--that is, to_utf8 returns the converted string, and the
assignment replaces the original string with the result.
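
The same in-place idea can be sketched without the global @results. This is
only an illustration, not code from AsyncZ.pm: convert_in_place is a
hypothetical helper, and the converter is passed in as a code ref so the
sketch does not depend on which MARC::Charset interface is installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AsyncZ.pm or MARC::Charset).
# $records is an array ref of strings; $convert is any code ref that
# takes one string and returns the converted string.
sub convert_in_place {
    my ($records, $convert) = @_;
    # $_ is aliased to each array element, so assigning to it
    # overwrites the original string.
    $_ = $convert->($_) for @$records;
    return scalar @$records;    # number of strings converted
}

# Stand-in converter so the sketch runs without MARC::Charset installed.
my @demo = ('abc', 'def');
convert_in_place(\@demo, sub { uc $_[0] });
print "@demo\n";
```

With the object interface shown in this message, the call would presumably
look like convert_in_place(\@raw, sub { $cs->to_utf8($_[0]) }), where @raw
is whatever array holds your MARC-8 strings.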


Myron


At 01:13 PM 27/06/2003, you wrote:


>I'd like to use MARC::Charset to convert the records to Unicode format. I
>read the pod, but am still not exactly sure how to do it. Also, I assume
>that I could not just convert the raw files and keep them as raw files,
>since the conversion would alter the total number of characters on the
>record, thus creating an invalid raw MARC record -- is this correct? Many
>thanks!
>
>Thanks!
>
>-- Chris Morgan
>_______________________________________________
>Net-z3950 mailing list
>[EMAIL PROTECTED]
>http://www.indexdata.dk/mailman/listinfo/net-z3950
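
On the record-length point raised in the quoted question: a raw MARC record
stores its total length in the leader and field lengths and offsets in the
directory, so a change in byte count leaves those values stale unless the
record is rebuilt. A minimal sketch of why the count changes, using only the
core Encode module. The MARC-8 byte value for the acute accent (0xE2, placed
before the base letter) is an assumption stated here for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Assumption: in MARC-8 (ANSEL) the combining acute accent is byte 0xE2
# and precedes the letter it modifies, so an accented "e" is two bytes.
my $marc8 = "\xE2" . "e";

# In Unicode the combining acute (U+0301) follows the letter. Encoded as
# UTF-8 the mark alone takes two bytes, three together with the "e".
my $utf8 = encode('UTF-8', "e\x{301}");

printf "MARC-8: %d bytes, UTF-8: %d bytes\n", length($marc8), length($utf8);
```

One accented letter is enough to make the lengths disagree, which is why the
converted text cannot simply be written back over the raw record.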
