From: "Shlomi Fish" <shlo...@shlomifish.org>

On Mon, 29 Oct 2012 10:09:53 +0200
Shlomi Fish <shlo...@shlomifish.org> wrote:

> Hi Octavian,
>
> On Sun, 28 Oct 2012 17:45:15 +0200
> "Octavian Rasnita" <orasn...@gmail.com> wrote:
>
> > From: "Shlomi Fish" <shlo...@shlomifish.org>
> >
> > Hi Octavian,
> >
> >
> >
> > Hi Shlomi,
> >
> > I tried to use XML::LibXML::Reader, which uses the pull parser, and I read
> > that:
> >
> > ""
> > However, it is also possible to mix Reader with DOM. At every point the
> > user may copy the current node (optionally expanded into a complete
> > sub-tree) from the processed document to another DOM tree, or to
> > instruct the Reader to collect sub-document in form of a DOM tree
> > ""
> >
> > So I tried:
> >
> > use XML::LibXML::Reader;
> >
> > my $xml = 'path/to/xml/file.xml';
> >
> > my $reader = XML::LibXML::Reader->new( location => $xml )
> >     or die "cannot read $xml";
> >
> > while ( $reader->nextElement( 'Lexem' ) ) {
> >     my $id = $reader->getAttribute( 'id' ); # works fine
> >
> >     my $doc = $reader->document;
> >
> >     my $timestamp = $doc->getElementsByTagName( 'Timestamp' ); # doesn't work well
> >     my @lexem_text = $doc->getElementsByTagName( 'Form' ); # doesn't work well
> >
> > }
> >
>
> I'm not sure you should do ->document. I cannot tell you off-hand how to do it
> right, but I can try to investigate when I have some spare cycles.
>

OK, after a short amount of investigation, I found that this program works:

[CODE]
use strict;
use warnings;

use XML::LibXML::Reader;

my $xml = 'Lexems.xml';

my $reader = XML::LibXML::Reader->new( location => $xml )
    or die "cannot read $xml";

while ( $reader->nextElement( 'Lexem' ) ) {
    my $id = $reader->getAttribute( 'id' ); # works fine

    # Deep-copy the current <Lexem> element instead of calling ->document.
    my $doc = $reader->copyCurrentNode(1);

    my $timestamp = $doc->getElementsByTagName( 'Timestamp' );
    my @lexem_text = $doc->getElementsByTagName( 'Form' );
}
[/CODE]

Note that you can also use XPath for looking up XML information.
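
For illustration, the XPath lookup mentioned above could look roughly like the sketch below. It is not taken from the thread: the sample XML is made up (only the element names come from the messages), `@lexems` is a hypothetical collector, and it assumes that `Timestamp`, `Form` and `InflectedForm` are direct children of `Lexem`.

```perl
use strict;
use warnings;

use XML::LibXML::Reader;

# Hypothetical sample data: element names follow the thread, values are invented.
my $xml = <<'XML';
<Lexems>
  <Lexem id="1">
    <Timestamp>2012-10-28T17:45:15</Timestamp>
    <Form>first</Form>
    <InflectedForm>
      <InflectionId>10</InflectionId>
      <Form>firsts</Form>
    </InflectedForm>
  </Lexem>
</Lexems>
XML

my $reader = XML::LibXML::Reader->new( string => $xml )
    or die "cannot parse the sample XML";

my @lexems;    # hypothetical collector for the extracted values

while ( $reader->nextElement('Lexem') ) {
    my $id = $reader->getAttribute('id');

    # Deep copy of the current <Lexem> element.
    my $lexem = $reader->copyCurrentNode(1);

    # XPath relative to the copied element; findvalue() returns a plain string.
    my %record = (
        id        => $id,
        timestamp => $lexem->findvalue('./Timestamp'),
        form      => $lexem->findvalue('./Form'),
        inflected => [],
    );

    # In list context findnodes() returns the matching nodes themselves.
    for my $inflected ( $lexem->findnodes('./InflectedForm') ) {
        push @{ $record{inflected} }, {
            id   => $inflected->findvalue('./InflectionId'),
            form => $inflected->findvalue('./Form'),
        };
    }

    push @lexems, \%record;
}

for my $lexem (@lexems) {
    print "$lexem->{id} $lexem->{timestamp} $lexem->{form}\n";
    print "  $_->{id} $_->{form}\n" for @{ $lexem->{inflected} };
}
```

One difference worth keeping in mind: `getElementsByTagName( 'Form' )` matches every descendant named `Form`, including those inside `InflectedForm`, whereas the XPath `./Form` only matches direct children of the context node.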

Regards,

    Shlomi Fish

--
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/


I followed your suggestion and it works fine, but it is very slow.

Here is what I did:

while ( $reader->nextElement( 'Lexem' ) ) {
    my $id = $reader->getAttribute( 'id' );

    my $doc = $reader->copyCurrentNode(1);

    my $timestamp = $doc->findnodes( 'Timestamp' );
    my $lexem_text = $doc->findnodes( 'Form' );
    my $inflected_forms = $doc->findnodes( 'InflectedForm' );

    for my $inflected_form ( $inflected_forms->get_nodelist ) {
        my $inflection_id = $inflected_form->findnodes( './InflectionId' );
        my $inflection_dia = $inflected_form->findnodes( './Form' );
    }
}

I tried to find a way to use XPath but couldn't find a good one, and it
seems that copying the node takes quite a long time.

Octavian

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/