On Mon, Jun 9, 2008 at 5:39 PM, Christopher Morgan <[EMAIL PROTECTED]> wrote:
> Jonathan,
>
> Many thanks. I get no errors on the command line or in the error log when I
> run the script. The file just executes with no output. If you have the time
> to run it, I've included the script below, and have attached the name
> authority record it tries to process:

The problem is that the SAX handler matches on each element's Name (the
qualified name, prefix included, e.g. mx:leader) instead of its LocalName.
I've attached a patch that tests both LocalName and NamespaceURI.  If you
could apply it to your copy of MARC/File/SAX.pm and test it, I'll commit
it to the CVS repo once you confirm it works.
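
To illustrate the Name versus LocalName difference, here is a minimal
sketch. It assumes XML::SAX (with its bundled PurePerl parser) is installed;
the NameDemo package and the sample XML are made up for illustration and are
not part of MARC::File::SAX.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handler (not part of MARC::File::SAX): it only records
# what the parser reports for each start tag.
package NameDemo;
use base 'XML::SAX::Base';

sub start_element {
    my ( $self, $element ) = @_;
    push @{ $self->{ seen } }, {
        Name         => $element->{ Name },          # qualified name
        LocalName    => $element->{ LocalName },     # name without prefix
        NamespaceURI => $element->{ NamespaceURI },  # URI bound to prefix
    };
}

package main;
use XML::SAX::PurePerl;

# Made-up sample using the same "mx:" prefix as the authority records.
my $xml = '<mx:record xmlns:mx="http://www.loc.gov/MARC21/slim">'
        . '<mx:leader>00000nz  a2200000n  4500</mx:leader>'
        . '</mx:record>';

my $handler = NameDemo->new;
my $parser  = XML::SAX::PurePerl->new( Handler => $handler );
$parser->parse_string( $xml );

# One line per element: Name | LocalName | NamespaceURI
for my $el ( @{ $handler->{ seen } } ) {
    print join( ' | ', @{ $el }{ qw( Name LocalName NamespaceURI ) } ), "\n";
}
```

For the leader element this should report a Name of mx:leader but a
LocalName of leader, which is why a comparison against the bare string
'leader' only succeeds when it is made on LocalName.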

--miker

>
> #! /usr/bin/perl
> use strict;
>
> use MARC::Record;
> use MARC::Batch;
> use MARC::File::XML;
> use constant MAX => 20;
>
> MARC::File::XML->default_record_format('UNIMARCAUTH');
> my $batch = MARC::Batch->new( 'XML', 'name_authority_file');
> while (my $record = $batch->next()) {
>     for my $field ($record->field("100")) {
>         my $name = $field->subfield('a');
>         print "$name", "\n";
>     }
> }
>
> I think you're right about the LOC files -- they probably got the extra
> spaces by accident. That's easy enough to fix.
>
> As far as the name authorities go, if I can't get MARC::File::XML to process
> them, I can always use XML::TokeParser. Not as elegant, but it would get the
> job done.
>
> - Chris
>
> -----Original Message-----
> From: Jonathan Gorman [mailto:[EMAIL PROTECTED]
> Sent: Monday, June 09, 2008 4:43 PM
> To: Christopher Morgan; perl4lib@perl.org
> Subject: Re: Can't parse MARC Authority XML files with mx: prefixes in their
> tags
>
>
>
>>However, I'm having trouble parsing the name authority records online
>>at http://alcme.oclc.org/eprintsUK/index.html
>
> [snipped code examples]
>>
>>There are "mx:" prefixes in all the tags. What format is this? Is there
>>any way I can get MARC::File::XML to parse these files?
>
> The prefixes are the namespace.  The parser should be able to handle this,
> but I don't honestly know if it does it correctly.  What also might be the
> problem is the second namespace in there.  It might help us if you included
> some information about what is not working (what error are you getting etc).
> I don't have the time right now to run my own test, but actual error
> messages might provide some clue.
>
>>A related question: When I first tried to process the subject authority
>>files from the LOC (in my first example, above), the program complained
>>that the "Leader must be 24 bytes long".
>
> Right, that comes from the MARC specification: the leader is always 24 bytes.
>
>>XML files are five years old. I wonder if the XML spec has changed
>>since then?)
>
> Doubt it. Again, it doesn't really have anything to do with the XML spec,
> but with the underlying XML record.  More likely it is some error in
> creating the files.  Can't give any more info though, sorry.
>
> Jon Gorman
>



-- 
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone: 1-877-OPEN-ILS (673-6457)
 | email: [EMAIL PROTECTED]
 | web: http://www.esilibrary.com
Index: MARC/File/SAX.pm
===================================================================
RCS file: /cvsroot/marcpm/marc-xml/lib/MARC/File/SAX.pm,v
retrieving revision 1.6
diff -p -u -r1.6 SAX.pm
--- MARC/File/SAX.pm	27 Nov 2007 20:28:18 -0000	1.6
+++ MARC/File/SAX.pm	10 Jun 2008 15:54:47 -0000
@@ -17,16 +17,17 @@ use MARC::Charset qw(utf8_to_marc8);
 
 sub start_element {
     my ( $self, $element ) = @_;
-    my $name = $element->{ Name };
-    if ( $name eq 'leader' ) { 
+    my $name = $element->{ LocalName };
+    my $ns = $element->{ NamespaceURI };
+    if ( $name eq 'leader' and $ns eq 'http://www.loc.gov/MARC21/slim' ) { 
 	$self->{ tag } = 'LDR';
-    } elsif ( $name eq 'controlfield' ) {
+    } elsif ( $name eq 'controlfield' and $ns eq 'http://www.loc.gov/MARC21/slim' ) {
 	$self->{ tag } = $element->{ Attributes }{ '{}tag' }{ Value };
-    } elsif ( $name eq 'datafield' ) { 
+    } elsif ( $name eq 'datafield' and $ns eq 'http://www.loc.gov/MARC21/slim' ) { 
 	$self->{ tag } = $element->{ Attributes }{ '{}tag' }{ Value };
 	$self->{ i1 } = $element->{ Attributes }{ '{}ind1' }{ Value };
 	$self->{ i2 } = $element->{ Attributes }{ '{}ind2' }{ Value };
-    } elsif ( $name eq 'subfield' ) { 
+    } elsif ( $name eq 'subfield' and $ns eq 'http://www.loc.gov/MARC21/slim' ) { 
 	$self->{ subcode } = $element->{ Attributes }{ '{}code' }{ Value };
     }
 }
@@ -34,7 +35,8 @@ sub start_element {
 sub end_element { 
     my ( $self, $element ) = @_;
-    my $name = $element->{ Name };
+    my $name = $element->{ LocalName };
-    if ( $name eq 'subfield' ) { 
+    my $ns = $element->{ NamespaceURI };
+    if ( $name eq 'subfield' and $ns eq 'http://www.loc.gov/MARC21/slim' ) { 
 	push @{ $self->{ subfields } }, $self->{ subcode };
 	
 	if ($self->{ transcode }) {
@@ -45,13 +47,13 @@ sub end_element { 
 
 	$self->{ chars } = '';
 	$self->{ subcode } = '';
-    } elsif ( $name eq 'controlfield' ) { 
+    } elsif ( $name eq 'controlfield' and $ns eq 'http://www.loc.gov/MARC21/slim' ) { 
 	$self->{ record }->append_fields(
 	    MARC::Field->new( $self->{ tag }, $self->{ chars } )
 	);
 	$self->{ chars } = '';
 	$self->{ tag } = '';
-    } elsif ( $name eq 'datafield' ) { 
+    } elsif ( $name eq 'datafield' and $ns eq 'http://www.loc.gov/MARC21/slim' ) { 
 	$self->{ record }->append_fields( 
 	    MARC::Field->new( 
 		$self->{ tag }, 
@@ -65,7 +67,7 @@ sub end_element { 
 	$self->{ i2 } = '';
 	$self->{ subfields } = [];
 	$self->{ chars } = '';
-    } elsif ( $name eq 'leader' ) { 
+    } elsif ( $name eq 'leader' and $ns eq 'http://www.loc.gov/MARC21/slim' ) { 
 	my $ldr = $self->{ chars };
 	$self->{ transcode }++
 		if (substr($ldr,9,1) eq 'a' and $self->{toMARC8});
