Re: Parsing XML

Rob Dixon Fri, 18 Jul 2008 18:15:17 -0700

Epanda wrote:
>
> Epanda wrote:
>>
>> I would like to know if we can parse XML with regexp faster than with
>> an MSXML or Xerces library ?
> 
> I just want to parse an XML file, and I have seen that Perl's XML::Parser,
> which is based on Expat, is the fastest in the world. I don't know Twig.
> 
> My XML is typical:
> <?xml version='1.0' encoding='ISO-8859-1'?>
> <!DOCTYPE CONF_INST SYSTEM "dtd_conf_inst.dtd">
> 
> <ROOT_NODE VERS="1.0">
>       <NODE1 TAG="VD/N1" SERIAL="3HHE">
>               <C>
>                       <ID>OM</ID>
>                       <VAL>SAT</VAL>
>               </C>
>               <C>
>                       <ID>TPS</ID>
>                       <VAL>3E+01</VAL>
>               </C>
>       </NODE1>
> </ROOT_NODE>
> 
> but can be very big.


XML::Twig is built on Expat, and is especially good at processing large files
one element at a time instead of loading the whole file into memory first. For
instance, if your data consists of multiple independent <NODE1> elements,
XML::Twig can be set up to process them individually and so save memory. Take a
look at http://www.xmltwig.com/xmltwig/
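
As a rough sketch of that element-at-a-time approach (untested against your
real data): it assumes each <NODE1> can be handled on its own, and it leaves
out the DOCTYPE line so that it does not depend on your DTD file. The @records
array and the hash keys are names I made up for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# Sample document shaped like the one quoted above (DOCTYPE omitted).
my $xml = <<'END_XML';
<?xml version='1.0' encoding='ISO-8859-1'?>
<ROOT_NODE VERS="1.0">
  <NODE1 TAG="VD/N1" SERIAL="3HHE">
    <C><ID>OM</ID><VAL>SAT</VAL></C>
    <C><ID>TPS</ID><VAL>3E+01</VAL></C>
  </NODE1>
</ROOT_NODE>
END_XML

my @records;    # hypothetical: collected only so the result can be inspected

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once for each complete <NODE1> element
        NODE1 => sub {
            my ($t, $node) = @_;
            for my $c ($node->children('C')) {
                push @records, {
                    serial => $node->att('SERIAL'),
                    id     => $c->first_child_text('ID'),
                    val    => $c->first_child_text('VAL'),
                };
            }
            $t->purge;    # free the memory used by what has been parsed so far
        },
    },
);

$twig->parse($xml);    # for a file on disk, use $twig->parsefile($filename)

printf "%s %s=%s\n", @{$_}{qw(serial id val)} for @records;
```

With a very large file you would do the real work inside the handler rather
than accumulate everything in an array, otherwise the memory saved by purge is
spent again on @records.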

But if you are hoping to write something that is faster than MSXML or Xerces you
may be unsuccessful. Perl also has the XML::LibXML and XML::Xerces modules if
you want to try those.

What do you need to do with the data? It may be possible with regular
expressions if the data is consistently formatted.
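
For example, if all you need is the ID/VAL pairs, something like the sketch
below might do. It assumes every <ID> is immediately followed by its <VAL>, and
that the file contains no comments, CDATA sections or entities inside those
elements; it is not a general XML parser. The %val_for hash is just a name I
chose.

```perl
use strict;
use warnings;

# Sample data shaped like the document quoted above.
my $xml = <<'END_XML';
<ROOT_NODE VERS="1.0">
  <NODE1 TAG="VD/N1" SERIAL="3HHE">
    <C>
      <ID>OM</ID>
      <VAL>SAT</VAL>
    </C>
    <C>
      <ID>TPS</ID>
      <VAL>3E+01</VAL>
    </C>
  </NODE1>
</ROOT_NODE>
END_XML

my %val_for;    # hypothetical: maps each ID to its VAL

# Capture the text of each <ID> and of the <VAL> that directly follows it,
# allowing optional whitespace around the values and between the elements.
while ($xml =~ m{<ID>\s*([^<]*?)\s*</ID>\s*<VAL>\s*([^<]*?)\s*</VAL>}g) {
    $val_for{$1} = $2;
}

print "$_ = $val_for{$_}\n" for sort keys %val_for;
```

This breaks as soon as the layout changes (a <VAL> before its <ID>, or an
extra element in between, for instance), which is why a real parser is usually
the safer choice.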

Rob

