"Kurt Klinner" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Hello, > > while trying to parse a "large" XML document i found a > strange behaviour of the Parser Module(s) (XML::Parser:PerlSAX, > XML::Parser, XML::Parser::Expat > > If my file XML file is larger then 65536 bytes > the actual character string is interrupted and a whitespace > is added. > > Any ideas how to avoid /fix that problem
As noted in another post, per the SAX spec a parser may fire multiple
characters() events for a single run of character data in an XML element.
All my SAX filter chains use XML::Filter::BufferText as their first filter.
This filter merges consecutive characters() events into one:

http://search.cpan.org/author/RBERJON/XML-Filter-BufferText-1.01/BufferText.pm

That doesn't explain why there is a space at that point. My guess would be
that you are in fact appending a space to each of your characters() events
before you forward the node downstream. Perhaps you could post that handler?

I am very curious. Could you post an example program? You could generate
well over 64k of sample CDATA (about 750k here) by using the 'x' operator:

    $doc = '<doc><![CDATA[' . 'abc<def&hij>klm' x 50_000 . ']]></doc>';

Todd W.
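
P.S. Here is a rough sketch of the buffering idea in plain Perl, in case it helps to see it without the filter module. It is not the XML::Filter::BufferText source. The package name BufferingHandler and the loop that feeds it are made up for illustration, and the loop only stands in for a parser that splits text at 65536 bytes. The handler methods take the usual PerlSAX-style hash with a Data key.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handler (the name is illustrative only): it collects every
# characters() chunk and keeps the joined text once the element ends.
# Simplified for a single text-only element; a real filter would also
# flush on other non-character events.
package BufferingHandler;

sub new {
    my ($class) = @_;
    return bless { buffer => '', chunks => 0, texts => [] }, $class;
}

sub start_element {
    my ($self, $el) = @_;
    $self->{buffer} = '';                 # start collecting afresh
}

sub characters {
    my ($self, $chars) = @_;
    $self->{chunks}++;
    $self->{buffer} .= $chars->{Data};    # append as-is, no added whitespace
}

sub end_element {
    my ($self, $el) = @_;
    push @{ $self->{texts} }, $self->{buffer};
    $self->{buffer} = '';
}

package main;

my $text    = 'abc<def&hij>klm' x 50_000;
my $handler = BufferingHandler->new;

# Assumption: stand-in for a real parser, delivering the text in
# 65536-byte pieces the way a chunking parser might.
$handler->start_element({ Name => 'doc' });
for (my $pos = 0; $pos < length($text); $pos += 65_536) {
    $handler->characters({ Data => substr($text, $pos, 65_536) });
}
$handler->end_element({ Name => 'doc' });

printf "%d characters() events, %d bytes after joining\n",
    $handler->{chunks}, length($handler->{texts}[0]);
print $handler->{texts}[0] eq $text ? "text intact\n" : "text differs\n";
```

If the joined text comes out intact here but not in your program, the extra space is most likely being added in your own characters() handler rather than by the parser.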