From: "Jenda Krynicky" <je...@krynicky.cz>

> From:           "Octavian Rasnita" <orasn...@gmail.com>
> To:             <beginners@perl.org>
> Subject:        Fast XML parser?
> Date sent:      Thu, 25 Oct 2012 14:33:15 +0300
> 
>> Hi,
>> 
>> Can you recommend an XML parser which is faster than XML::Twig?
>> 
>> I need to use an XML parser that can parse the XML files chunk by chunk and 
>> which works faster (much faster) than XML::Twig, because I tried using this 
>> module but it is very slow.
>> 
>> I tried something like the code below, and I also tried a version
>> that just opens the file and parses it using regular expressions.
>> The inelegant regexp version is 25 times faster than the one
>> which uses XML::Twig, and it also uses less memory.
>> 
>> If you think there is a module for parsing XML which would work faster
>> than regular expressions, or if I can substantially improve the
>> program which uses XML::Twig, then please tell me about it. If regexps
>> are still faster, I will use regexps.
> 
> You did not specify what you want to do with the lexemes, but
> you might try something like this:
> 
> use strict;
> use XML::Rules;
> use Data::Dumper;
> 
> my $parser = XML::Rules->new(
>     stripspaces => 7,
>     rules => {
>         _default => 'content',
>         InflectedForm => 'as array',
>         Lexem => sub {
>             #print Dumper($_[1]);
>             print "$_[1]->{Form}\n";
>             foreach (@{$_[1]->{InflectedForm}}) {
>                 print "  $_->{InflectionId}: $_->{Form}\n";
>             }
>         },
>     }
> );
> 
> $parser->parse(\*DATA);
> 
> __DATA__
> <?xml version="1.0" encoding="UTF-8"?>
> <Lexems>
>  <Lexem id="1">
> ...
> 
> XML::Rules sits on top of XML::Parser::Expat so I would not expect 
> this to be 25 times faster than XML::Twig, but it might be a bit 
> quicker. Or not.
> 
> Jenda



Hi Jenda,

I tried your program above, modified as below, but it gives the error:

Free to wrong pool 3967d8 not 20202020 at e:/usr/lib/XML/Parser/Expat.pm line 470.

I was able to install XML::Rules under Windows using cpanm with no problems, so 
it should be working...

The program:

use strict;
use XML::Rules;
use Data::Dumper;

my $parser = XML::Rules->new(
    stripspaces => 7,
    rules => {
        _default => 'content',
        InflectedForm => 'as array',
        Lexem => sub {
            #print Dumper($_[1]);
            #print "$_[1]->{Form}\n";
            foreach (@{$_[1]->{InflectedForm}}) {
                #print "  $_->{InflectionId}: $_->{Form}\n";
            }
        },
    }
);

my $file = '/path/to/file.xml';

open my $xml, '<:utf8', $file or die "Cannot open $file: $!";

$parser->parse( $xml );
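
One thing that may be worth ruling out (an assumption, not a confirmed diagnosis of the error above): the program hands XML::Parser::Expat a filehandle that already has a :utf8 layer, while expat normally expects raw bytes and decodes them itself from the encoding declared in the XML prolog. A minimal sketch of the same parse over a raw handle follows. The sample data, the temporary file and the $count counter are made up for illustration, shaped only after the element names used in the rules above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Rules;

# Made-up sample data; the real file is not shown in this thread.
my ($tmp_fh, $file) = tempfile(SUFFIX => '.xml', UNLINK => 1);
binmode $tmp_fh, ':raw';
print {$tmp_fh} <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<Lexems>
  <Lexem id="1">
    <Form>a</Form>
    <InflectedForm><InflectionId>1</InflectionId><Form>a</Form></InflectedForm>
  </Lexem>
  <Lexem id="2">
    <Form>b</Form>
    <InflectedForm><InflectionId>1</InflectionId><Form>b</Form></InflectedForm>
  </Lexem>
</Lexems>
XML
close $tmp_fh or die "Cannot close $file: $!";

# Same rules as in the program above; the handler only counts records
# (hypothetical $count) so the parse can be checked on its own.
my $count  = 0;
my $parser = XML::Rules->new(
    stripspaces => 7,
    rules       => {
        _default      => 'content',
        InflectedForm => 'as array',
        Lexem         => sub { $count++; return },
    },
);

# Assumption being tested: open the handle in raw mode, with no :utf8
# layer, so that expat receives undecoded bytes.
open my $xml, '<:raw', $file or die "Cannot open $file: $!";
$parser->parse($xml);
close $xml;

print "Parsed $count Lexem elements\n";
```

If the raw handle makes the error go away, calling $parser->parsefile($file) instead of opening the file by hand should be an equivalent and shorter route. If the error persists, the cause is more likely in the XML::Parser build itself than in this program.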


Thanks.

Octavian

