Re: MARC::Charset problem

Edward Summers Thu, 22 Jun 2006 11:47:29 -0700

On Jun 22, 2006, at 5:34 AM, [EMAIL PROTECTED] wrote:

I'm using MARC::Charset::marc8_to_utf8() v0.95 to transcode some
Library of Congress data to utf8, however I'm finding a problem with
character 'ø' (hex 0xB2 - lowercase scandinavian o / latin small
letter o with stroke). This character is transcoding to 0xF8, which is
not valid utf8, when it should transcode to 0xC3B8. (According to the
documentation, 0xF8 seems to be the ucs transcoding of this character).


Is this a bug in MARC::Charset or am I missing something?

Well, I tried this out in the debugger with perl 5.8.7 and MARC::Charset v0.95:


--

  main::(-e:1):   1
    DB<1> use MARC::Charset qw(marc8_to_utf8)
    DB<2> $utf8 = marc8_to_utf8(chr(0xB2));
    DB<3> print "works" if $utf8 eq chr(0xF8);
  works

--

So it appears to be working fine. Perhaps when you are writing out your data you aren't preparing the filehandle for utf8? Can you provide a simple test script that demonstrates the problem so others can try to replicate?
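For what it's worth, here is a rough sketch of the filehandle point, using only the core Encode module so it runs without MARC::Charset installed. It assumes chr(0xF8) is a fair stand-in for the value marc8_to_utf8() returned in the session above; the variable names are just for illustration. The idea is that the string holds the character U+00F8, and the two octets C3 B8 only appear once the string is encoded on output:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Assumption: this stands in for the result of marc8_to_utf8(chr(0xB2)).
# It is a one-character Perl string holding code point U+00F8.
my $char = chr(0xF8);

# Encoding the character string to UTF-8 produces the octets.
my $octets = encode('UTF-8', $char);
my $hex    = join ' ', map { sprintf '%02X', ord($_) } split //, $octets;

printf "code point:   U+%04X\n", ord($char);
print  "UTF-8 octets: $hex\n";

# Declare the encoding on the filehandle before writing the character,
# so perl emits the UTF-8 octets rather than the single byte 0xF8.
binmode(STDOUT, ':encoding(UTF-8)');
print $char, "\n";
```

If the output filehandle has no encoding layer, perl will typically write the single byte 0xF8 for this string, which would look like the invalid utf8 described in the original message.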


//Ed


Thanks,

Michael
