Hi everybody,

This XML thing is currently driving me nuts :) Let's assume I have the
following XML tree as input:

# ------------------------- file.xml --------------------------- #
<MOVIE_LIST>
<MOVIE>
        <NAME>Name of the Movie</NAME>
        <MOVIE_ID>28372382</MOVIE_ID>
        <DESCRIPTIONS>
                <LONG_DESCRIPTION>This is a long description</LONG_DESCRIPTION>
                <SHORT_DESCRIPTION>short description</SHORT_DESCRIPTION>
        </DESCRIPTIONS>
        <DIRECTOR_LIST>
                <DIRECTOR>director 1</DIRECTOR>
                <DIRECTOR>director 2</DIRECTOR>
        </DIRECTOR_LIST>
</MOVIE>
<MOVIE>
...
</MOVIE>
</MOVIE_LIST>

# -------------------------------------------------------------- #

I'm using the Perl module "XML::Twig" to parse the input file (the input
XML file is roughly 200 MB, so it's quite big, and XML::Twig is really
efficient with memory usage).

Extracting values from elements that have no children (like NAME or
MOVIE_ID) works like a charm, but so far I haven't managed to extract the
child elements successfully (like the DIRECTOR entries in DIRECTOR_LIST).

I'm currently working with this snippet:

# ------------------------- snip --------------------------- #
#!/usr/bin/perl -w
use strict;
use warnings;
use XML::Twig;
use Data::Dumper;

my $file = 'file.xml';
my @movies;    # filled by the movie() handler below
my $twig = XML::Twig->new(
                TwigHandlers => { 'MOVIE'  => \&movie },
                TwigRoots    => {MOVIE_LIST => 1},
                pretty_print => 'indented');
$twig->parsefile($file);

sub movie {
         my ($t, $elt) = @_;
        
         my $MOVIE_ID = "";
         my $NAME = "";
        
         # this is working pretty well
         $MOVIE_ID = $elt->first_child_text('MOVIE_ID');
         $NAME = $elt->first_child_text('NAME');
        
         push(@movies, [ $MOVIE_ID,
                         $NAME,
         ]);
                
         # reduce memory usage :-) (purge drops the parsed part without
         # printing it; flush would also write it to STDOUT)
         $t->purge;
}

print Dumper \@movies;

# ------------------------- /snip --------------------------- #

So, how do I extract the child values within <DIRECTOR_LIST>?
I haven't managed it so far, so any hints are much appreciated.
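
For illustration, here is a minimal sketch of one possible approach. It
assumes each MOVIE holds at most one DIRECTOR_LIST, laid out as in the
sample above. It uses the first_child and children_text methods of
XML::Twig elements. The inline sample data and the hash keys (id, name,
directors) are made up for this example and are not part of the real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Inline sample mirroring the structure of file.xml (illustrative data only)
my $xml = <<'XML';
<MOVIE_LIST>
<MOVIE>
        <NAME>Name of the Movie</NAME>
        <MOVIE_ID>28372382</MOVIE_ID>
        <DIRECTOR_LIST>
                <DIRECTOR>director 1</DIRECTOR>
                <DIRECTOR>director 2</DIRECTOR>
        </DIRECTOR_LIST>
</MOVIE>
</MOVIE_LIST>
XML

my @movies;

my $twig = XML::Twig->new(
    twig_handlers => { MOVIE => \&movie },
);
$twig->parse($xml);    # for the real input: $twig->parsefile('file.xml')

sub movie {
    my ($t, $elt) = @_;

    # DIRECTOR_LIST may be missing, so check before descending into it
    my $list      = $elt->first_child('DIRECTOR_LIST');
    my @directors = $list ? $list->children_text('DIRECTOR') : ();

    push @movies, {
        id        => $elt->first_child_text('MOVIE_ID'),
        name      => $elt->first_child_text('NAME'),
        directors => \@directors,
    };

    # discard the processed MOVIE to keep memory usage low
    $t->purge;
}

# One line per movie: id, name, and the comma-separated directors
for my $m (@movies) {
    print "$m->{id}: $m->{name} (", join(', ', @{ $m->{directors} }), ")\n";
}
```

Storing the directors as an array reference keeps one record per movie even
when the number of directors varies from movie to movie.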

Thanks,
Werner
