While dumping a pile of MARC bib records to MARCXML, I ran into a funky
legacy record. It was 8-bit encoded (probably MARC-8), but its contents
matched ISO-8859-1 ("ANSI"), which is a valid encoding for XML, closely
enough that I decided to just dump it with that encoding.
"But how?" one might ask. Well, here's a patch that allows it. The 3 changed
lines of Perl, plus some perldoc to explain them, let you set the encoding
declared in the output XML to anything you like. The encoding defaults to
UTF-8, so the current functionality is unchanged.
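For illustration, here is a minimal standalone sketch of what the patched
header() does, runnable without MARC::File::XML installed. The sub name
xml_header is made up for this example, and the collection element is trimmed
down to its namespace. With the patch applied, the real call would be along the
lines of MARC::File::XML::encode( $record, 'ISO-8859-1' ).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the patched MARC::File::XML::header().
# The optional first argument is the encoding name to declare;
# it falls back to 'UTF-8' when omitted, as in the patch.
sub xml_header {
    my $encoding = shift || 'UTF-8';
    return <<"MARC_XML_HEADER";
<?xml version="1.0" encoding="$encoding"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
MARC_XML_HEADER
}

# No argument: the declaration stays UTF-8 (the existing behaviour).
print xml_header();

# Explicit encoding: the declaration names ISO-8859-1 instead.
print xml_header('ISO-8859-1');
```

Note that this only changes the declaration at the top of the file. The record
data is written out as it is, so the declared encoding has to match the bytes
that are actually in the record.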
Feedback welcome and encouraged. :)
--
miker
--- /usr/lib/perl5/site_perl/5.8.5/MARC/File/XML.pm 2004-05-19 22:21:03.000000000 -0400
+++ XML.pm 2004-09-22 22:51:27.816511864 -0400
@@ -201,11 +201,16 @@ different portions.
Returns a string of XML to use as the header to your XML file.
+This method takes an optional $encoding parameter to set the output encoding
+to something other than 'UTF-8'. This is meant mainly to support slightly
+broken records that are in ISO-8859-1 (ANSI) format with 8-bit characters.
+
=cut
sub header {
+ my $encoding = shift || 'UTF-8';
return( <<MARC_XML_HEADER );
-<?xml version="1.0" encoding="UTF-8"?>
+<?xml version="1.0" encoding="$encoding"?>
<collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd" xmlns="http://www.loc.gov/MARC21/slim">
MARC_XML_HEADER
}
@@ -325,19 +330,23 @@ sub decode {
}
-=head2 encode()
+=head2 encode([$encoding])
You probably want to use the as_marc() method on your MARC::Record object
instead of calling this directly. But if you want to you just need to
pass in the MARC::Record object you wish to encode as XML, and you will be
returned the XML as a scalar.
+This method takes an optional $encoding parameter to set the output encoding
+to something other than 'UTF-8'. This is meant mainly to support slightly
+broken records that are in ISO-8859-1 (ANSI) format with 8-bit characters.
+
=cut
sub encode {
my $record = shift;
my @xml = ();
- push( @xml, header() );
+ push( @xml, header(shift) );
push( @xml, record( $record ) );
push( @xml, footer() );
return( join( "\n", @xml ) );