How to convert from ANSEL/MARC-8 to UTF-8?

Michael Lackhoff Wed, 07 Jan 2009 08:43:08 -0800

Hello,

I would like to search the NLM with a Z39.50 Perl client. This is
working quite well, with one exception: I have not yet found a way to
convert the charset to UTF-8.

First I tried the standard Encode module, because I thought it uses
iconv and should therefore have a definition for ANSEL, but I got an
error message about an unknown encoding.

Then I tried MARC::Charset. That more or less works but has two
problems. First, the database is really huge (400 MB). Second, and more
important, it doesn't convert a combining diacritic + base char to the
combined character, so I still have two characters for e.g. the German
umlauts. This may be valid UTF-8, but it is not usable for display in
(X)HTML.

Is there any other option, short of doing it by hand with lots of s///
for at least the most common combinations?
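
To make the two-characters problem concrete, here is a minimal sketch
using the core Unicode::Normalize module. The input string is a
hard-coded stand-in for converter output (base letter followed by a
combining diaeresis), not actual data from the NLM, and whether NFC
composition covers every ANSEL combination is an assumption that has
not been checked here.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# Stand-in for converter output: "u" followed by
# U+0308 COMBINING DIAERESIS (decomposed form of u-umlaut).
my $decomposed = "u\x{0308}";

# NFC composes base char + combining mark where a
# precomposed character exists in Unicode.
my $composed = NFC($decomposed);

printf "before: %d code point(s)\n", length $decomposed;
printf "after:  %d code point(s), U+%04X\n",
    length $composed, ord $composed;
```

If this holds for real records, a single NFC() call on each converted
field would replace the hand-written s/// list, at least for
combinations that have a precomposed form.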


Thanks for any help
-Michael
