From: "Mike Blezien" <[EMAIL PROTECTED]>
> we need to parse some very large XML files, approx., 900-1000KB's filesize. A 
> sample of a typical XML file can be view here that would be parsed: 
> http://projects.thunder-rain.com/uploads/000001.xml

I'm probably coming late, but anyway ... this looks like a 
perfect task for my XML::Rules. The URL doesn't work anymore, so I'm 
guessing at the structure of the XML.

Using XML::Rules the code would look somewhat like this:

#!perl
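use strict;
use warnings;
use XML::Rules;

my $parser = XML::Rules->new(
  rules => [
        # any tag without its own rule: keep just its text content
        _default => 'content',
        # <tracks>: merge its children into the enclosing <product>, drop the whitespace
        tracks => 'pass no content',
        # <track>, <sound>: hash of child data, pushed onto an array in the parent
        'track,sound' => 'no content array',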

        # called when each </product> is closed; $attr holds the data from its children
        product => sub {
                my ($tag, $attr) = @_;
                delete $attr->{_content};
#use Data::Dumper;
#print Dumper($attr);

                print <<"*END*";
article_number: $attr->{'article_number'}
distributor_number: $attr->{'distributor_number'}
distributor_name: $attr->{'distributor_name'}
artist: $attr->{'artist'}
ean_upc: $attr->{'ean_upc'}
set_total: $attr->{'set_total'}
*END*

                foreach my $track (@{$attr->{track}}) {
                        print "  Track: $track->{trackno}. $track->{title} ($track->{setno})\n";
                        foreach my $sound (@{$track->{sound}}) {
                                print "    Sound: $sound->{file}\n     Type: $sound->{sound_type} (Codec: $sound->{codec})\n";
                        }
                }
                print "\n";

                # return nothing, so the finished product is not kept in memory
                return;
        }
  ]
);

$parser->parse(\*DATA);

__DATA__
<products>
 <product>
  <article_number>Blah blah</article_number>
  <distributor_number>Blah blah</distributor_number>
  <distributor_name>Blah blah</distributor_name>
  <artist>Blah blah</artist>
  <ean_upc>Blah blah</ean_upc>
  <set_total>Blah blah</set_total>
  <tracks>
   <number_of_tracks>2</number_of_tracks>
   <track>
    <title>Blah blah</title>
    <trackno>1</trackno>
    <setno>Blah blah</setno>
    <sound>
     <sound_type>Blah blah</sound_type>
     <codec>Blah blah</codec>
     <file>Blah blah</file>
    </sound>
   </track>
   <track>
    <title>YDFbibusdf</title>
    <trackno>2</trackno>
    <setno>Blah blah</setno>
    <sound>
     <sound_type>Blah blah</sound_type>
     <codec>Blah blah</codec>
     <file>Blah blah</file>
    </sound>
   </track>
  </tracks>
 </product>
</products>

__END__


I believe this will be even more efficient than XML::Twig.
http://xmltwig.com/article/ways_to_rome/ways_to_rome.html#todo

HTH, Jenda
===== [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =====
When it comes to wine, women and song, wizards are allowed 
to get drunk and croon as much as they like.
        -- Terry Pratchett in Sourcery

