Hello,
we need to parse some large XML files, roughly 900-1000 KB each. A sample of
a typical file that would be parsed can be viewed here:
http://projects.thunder-rain.com/uploads/000001.xml
I was planning on using the XML::Twig module for this, with the following
code snippet to loop through each of the <product> ... </product> elements.
Not every child element is needed, but most of those within each
<product></product> are.
# Code snip:
####################################################################
use strict;
use warnings;
use CGI;
use XML::Twig;

my $xmlfile = '/path/to/upload/000001.xml';
my $cgi     = CGI->new;
my $twig    = XML::Twig->new(
    twig_handlers => { product => \&get_products },
);
$twig->parsefile($xmlfile);

# Called once for each complete <product> ... </product> element.
sub get_products {
    my ($t, $elt) = @_;

    my $article_number     = $elt->first_child_text('article_number');
    my $ean_upc            = $elt->first_child_text('ean_upc');
    my $distributor_number = $elt->first_child_text('distributor_number');
    my $distributor_name   = $elt->first_child_text('distributor_name');
    my $artist             = $elt->first_child_text('artist');

    # Now loop through each
    # <tracks><number_of_tracks></number_of_tracks><playtime></playtime>
    #   <track> <sound> </sound> </track></tracks> for each product.
    # The <number_of_tracks> element gives the total number of
    # <track> <sound> </sound> </track> entries inside <tracks> .. </tracks>.

    $t->purge();    # release the memory held by the product just handled
}
exit();
#################################################################
The part I'm having a lot of trouble with is the elements nested within each
product: the <tracks> ... </tracks> block, and looping through each of its
<track> child elements and their <sound></sound> elements.
---------
<product>
  .......
  <tracks>
    <number_of_tracks></number_of_tracks><playtime></playtime>
    <track> ....
      <sound> ..
      </sound>
    </track>
  </tracks>
  ........
</product>
--------
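To make the question concrete, here is a rough sketch of the nested loop I think is needed inside the handler. It is only a sketch under assumptions: the <products> root element, the sample values, and the idea that each <sound> holds plain text and sits directly under <track> are all made up for illustration and may not match the real file. The @products array and the get_product name are likewise just hypothetical names for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data: only the element names come from the structure
# above; the root element, the values and the two-track layout are invented.
my $xml = <<'XML';
<products>
  <product>
    <article_number>A-1</article_number>
    <artist>Example Artist</artist>
    <tracks>
      <number_of_tracks>2</number_of_tracks>
      <playtime>07:30</playtime>
      <track><sound>one.mp3</sound></track>
      <track><sound>two.mp3</sound></track>
    </tracks>
  </product>
</products>
XML

my @products;    # one hash ref per <product>

my $twig = XML::Twig->new(
    twig_handlers => { product => \&get_product },
);
$twig->parse($xml);    # for a file on disk, use $twig->parsefile($path)

sub get_product {
    my ($t, $elt) = @_;

    my %product = (
        article_number => $elt->first_child_text('article_number'),
        artist         => $elt->first_child_text('artist'),
        sounds         => [],
    );

    # <tracks> may be missing, so check before descending into it.
    if (my $tracks = $elt->first_child('tracks')) {
        $product{number_of_tracks} = $tracks->first_child_text('number_of_tracks');
        $product{playtime}         = $tracks->first_child_text('playtime');

        # children('track') returns only the direct <track> children.
        for my $track ($tracks->children('track')) {
            # Assumes one plain-text <sound> per <track>.
            push @{ $product{sounds} }, $track->first_child_text('sound');
        }
    }

    push @products, \%product;
    $t->purge;    # free the memory used by this <product>
}

for my $p (@products) {
    print "$p->{article_number}: ", join(', ', @{ $p->{sounds} }), "\n";
}
```

The loop iterates over the <track> elements actually present rather than counting up to <number_of_tracks>, so it should still behave if the two ever disagree.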
Is there a better way to obtain all the data within each of the
<product> ... </product> elements? I've never really worked with XML files
this large, or with a tree this complex. Any help or suggestions would be
much appreciated.
TIA
Mike(mickalo)Blezien
===============================
Thunder Rain Internet Publishing
Providing Internet Solution that Work
===============================