Myron,

You're the first to reply -- and thanks, I'll give this a try! (I'm copying this to [EMAIL PROTECTED])

-- Chris

-----Original Message-----
From: Myron Turner [mailto:[EMAIL PROTECTED]
Sent: Saturday, June 28, 2003 9:50 AM
To: Christopher Morgan
Subject: Re: [Net-z3950] Question about using MARC::Charset

Hi Chris,

I don't know if anyone has responded to you yet. But your guess is correct--the character count changes when converting between MARC-8 and utf8.

Doing it is simple. You create a MARC::Charset object:

    my $cs = MARC::Charset->new();

and then pass a string of MARC-8 characters to $cs->to_utf8().

In my AsyncZ.pm module I have the following function, which makes the conversion:

    sub _utf8 {
        my $index = shift;
        my $cs = MARC::Charset->new();
        # Convert each raw string stored for this index from MARC-8 to utf8.
        for (my $i = 0; $i < scalar(@{$results[$index]}); $i++) {
            $results[$index]->[$i] = $cs->to_utf8($results[$index]->[$i]);
        }
    }

$results[$index]->[$i] is a string from the raw file, which is converted in place--that is, to_utf8 returns the converted string and the assignment replaces the original with it.

Myron

At 01:13 PM 27/06/2003, you wrote:
>I'd like to use MARC::Charset to convert the records to Unicode format. I
>read the pod, but am still not exactly sure how to do it. Also, I assume
>that I could not just convert the raw files and keep them as raw files,
>since the conversion would alter the total number of characters on the
>record, thus creating an invalid raw MARC record -- is this correct? Many
>thanks!
>
>Thanks!
>
>-- Chris Morgan
>_______________________________________________
>Net-z3950 mailing list
>[EMAIL PROTECTED]
>http://www.indexdata.dk/mailman/listinfo/net-z3950