Unless I'm very much mistaken, Chris's code is outputting UTF-8 to
the terminal, not MARC-8.

The key is to find a terminal program that correctly displays UTF-8.
I doubt you'll have any trouble finding one -- for example, there
are at least two for Mac OS X alone (Terminal.app and iTerm).

Depending on your platform, freshmeat.net or tucows.com may be the
place to go.  This thread from the linux-utf8 list may also be
helpful (I googled for 'terminal UTF-8'):

http://mail.nl.linux.org/linux-utf8/2003-07/msg00231.html

Paul.

On Thursday, July 1, 2004, at 11:22  AM, Houghton,Andrew wrote:

From: Christopher Morgan [mailto:[EMAIL PROTECTED]
Sent: 01 July, 2004 10:50
Subject: Displaying diacritics in a terminal vs. a browser

I use the $cs->to_utf8 conversion from MARC::Charset to
display MARC Authority records in a browser, and the
diacritics display properly there.
But they don't display properly via STDOUT in my terminal
window (I get two characters instead of one -- one with the
letter and one with the accent mark). Am I doing something
wrong? I'm using:

        binmode (STDOUT, ":utf8");

Is there any way around this problem, or is it a limitation
of terminal displays?

I'm not sure what MARC::Charset does internally, but MARC-8
encodes each diacritic separately from its base character.  So
even with binmode(STDOUT, ":utf8") the output will contain two
characters: the base character followed by the diacritic.  If you
want them combined, you need to combine them yourself.

It just so happens that I have recently been converting MARC-XML
to RDF.  The RDF specification mandates Unicode Normalization
Form C (NFC), in which the base character and the diacritic are
combined into a single character.  MARC-XML uses Unicode
Normalization Form D (NFD), in which the base character is kept
separate from the diacritic.  So I hacked together some Perl
scripts to convert Unicode NFD <-> Unicode NFC.
The scripts require Perl 5.8.0.
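
For illustration only, here is a minimal sketch of the NFD -> NFC
step.  It assumes the Unicode::Normalize module that ships with
Perl 5.8 and is not necessarily how the scripts mentioned above
work; the sample string is made up for the example.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

# Hypothetical sample data: "e" followed by U+0301 COMBINING ACUTE
# ACCENT, i.e. the decomposed (NFD) form of e-acute.
my $decomposed = "e\x{0301}";

# NFC composes a base character and its combining mark into a single
# code point wherever Unicode defines a precomposed character.
my $composed = NFC($decomposed);

binmode(STDOUT, ":utf8");
printf "decomposed: %d code point(s)\n", length $decomposed;
printf "composed:   %d code point(s)\n", length $composed;
print "$composed\n";
```

Calling NFD($composed) goes the other way and splits the character
back into base character plus combining mark.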

I was talking with a colleague, just yesterday, about whether we
should unleash these on the Net...  They need to be cleaned up a
little and need some basic documentation on how to run the Perl
scripts.


Andy.

Andrew Houghton, OCLC Online Computer Library Center, Inc.
http://www.oclc.org/about/
http://www.oclc.org/research/staff/houghton.htm

--
Paul Hoffman :: Taubman Medical Library :: Univ. of Michigan
[EMAIL PROTECTED] :: [EMAIL PROTECTED] :: http://www.nkuitse.com/


