Hello,

We need to parse some large XML files, roughly 900-1000 KB each. A sample of a typical file that would be parsed can be viewed here: http://projects.thunder-rain.com/uploads/000001.xml

I was planning on using the XML::Twig module to do this, with the following code snippet to loop through each of the <product> ... </product> elements. Not every child element is needed, but most of those within each <product></product> are.

# Code snippet:
####################################################################
use strict;
use warnings;
use CGI;
use XML::Twig;

my $xmlfile = '/path/to/upload/000001.xml';
my $cgi     = CGI->new;
my $twig    = XML::Twig->new(
    twig_handlers => {
        product => \&get_products,
    },
);
$twig->parsefile($xmlfile);

# Called once for each complete <product> element.
sub get_products {
    my ($t, $elt) = @_;

    my $article_number     = $elt->first_child_text('article_number');
    my $ean_upc            = $elt->first_child_text('ean_upc');
    my $distributor_number = $elt->first_child_text('distributor_number');
    my $distributor_name   = $elt->first_child_text('distributor_name');
    my $artist             = $elt->first_child_text('artist');

    # Now loop through <tracks> for this product. It holds
    # <number_of_tracks>, <playtime> and the <track> elements,
    # each with a <sound> child. <number_of_tracks> gives the
    # number of <track> elements in the loop.

    # Release the memory used by the product just processed.
    $t->purge();
}

exit();
#################################################################

The area I'm having a lot of trouble with is the elements within each product: the <tracks> ... </tracks> block, and looping through each track's child elements and its <sound></sound>:
---------
<product>
.......
<tracks>
<number_of_tracks></number_of_tracks><playtime></playtime>
  <track> ....
     <sound> ..
     </sound>
  </track>
</tracks>
........
</product>
--------
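
For reference, here is a minimal sketch of the nested loop I am trying to get to. It is only a guess at the approach, not tested against the real file. The sample data is made up: <title>, <format> and <url> are placeholder names standing in for whatever the real <track> and <sound> elements contain, and the @products array is just one possible way to collect the results.

```perl
use strict;
use warnings;
use XML::Twig;

# Made-up sample data. <title>, <format> and <url> are placeholders,
# not the element names from the real feed.
my $xml = <<'XML';
<products>
  <product>
    <article_number>A1</article_number>
    <tracks>
      <number_of_tracks>2</number_of_tracks>
      <playtime>7:30</playtime>
      <track>
        <title>First</title>
        <sound><format>mp3</format><url>first.mp3</url></sound>
      </track>
      <track>
        <title>Second</title>
        <sound><format>mp3</format><url>second.mp3</url></sound>
      </track>
    </tracks>
  </product>
</products>
XML

my @products;    # one hash ref per <product> (hypothetical structure)

my $twig = XML::Twig->new(
    twig_handlers => { product => \&get_products },
);
$twig->parse($xml);    # would be parsefile($xmlfile) for the real file

sub get_products {
    my ($t, $elt) = @_;

    my %product = (
        article_number => $elt->first_child_text('article_number'),
        tracks         => [],
    );

    # <tracks> might be missing, so check before descending into it.
    if (my $tracks = $elt->first_child('tracks')) {
        $product{number_of_tracks} = $tracks->first_child_text('number_of_tracks');
        $product{playtime}         = $tracks->first_child_text('playtime');

        for my $track ($tracks->children('track')) {
            my %track;

            # '#ELT' returns child elements only and skips text nodes.
            for my $child ($track->children('#ELT')) {
                if ($child->tag eq 'sound') {
                    # Collect the children of <sound> into their own hash.
                    my %sound;
                    $sound{ $_->tag } = $_->text for $child->children('#ELT');
                    $track{sound} = \%sound;
                }
                else {
                    $track{ $child->tag } = $child->text;
                }
            }
            push @{ $product{tracks} }, \%track;
        }
    }

    push @products, \%product;
    $t->purge;    # free the memory used by this <product>
}
```

Looping over the <track> children directly means <number_of_tracks> is not needed to drive the loop, although it could still be compared against the number of tracks found as a sanity check.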

Is there a better way to obtain all the data within each of the <product> ... </product> elements? I've never really worked with XML files this large or with a tree this complex. Any help or suggestions would be much appreciated.

TIA
Mike(mickalo)Blezien
===============================
Thunder Rain Internet Publishing
Providing Internet Solution that Work
===============================