"Kurt Klinner" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Hello,
>
> while trying to parse a "large" XML document I found a
> strange behaviour of the parser modules (XML::Parser::PerlSAX,
> XML::Parser, XML::Parser::Expat).
>
> If my XML file is larger than 65536 bytes,
> the current character string is interrupted and a whitespace
> character is added.
>
> Any ideas how to avoid or fix that problem?

As noted in another post, per the SAX spec a parser may fire multiple
characters() events for a single run of character data in an XML node.
All my SAX pipelines use XML::Filter::BufferText as their first filter.
This filter merges consecutive characters() events into one event:

http://search.cpan.org/author/RBERJON/XML-Filter-BufferText-1.01/BufferText.pm
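
To show the idea behind buffering, here is a minimal sketch in core Perl.
It is not the source of XML::Filter::BufferText: the BufferingHandler
package is hypothetical, and the parser is simulated by calling the handler
methods by hand. The only thing taken from the SAX conventions is that
characters() receives its text in the Data key of a hash reference.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handler: collects every characters() chunk for the current
# element and stores the joined text once, at end_element().
package BufferingHandler;

sub new {
    my ($class) = @_;
    return bless { buffer => '', texts => [] }, $class;
}

sub start_element {
    my ($self) = @_;
    $self->{buffer} = '';
}

# SAX-style characters() event: the text arrives in $data->{Data}.
# Append it as-is; adding a separator here would corrupt split text.
sub characters {
    my ($self, $data) = @_;
    $self->{buffer} .= $data->{Data};
}

sub end_element {
    my ($self) = @_;
    push @{ $self->{texts} }, $self->{buffer};
    $self->{buffer} = '';
}

package main;

# Simulate a parser that splits one text node into two events.
my $h = BufferingHandler->new;
$h->start_element({ Name => 'doc' });
$h->characters({ Data => 'first half, ' });
$h->characters({ Data => 'second half' });
$h->end_element({ Name => 'doc' });

print "$h->{texts}[0]\n";    # prints "first half, second half"
```

The sketch only handles an element with no children, because the buffer is
reset on every start_element(). A real filter also has to flush the buffer
before forwarding any non-character event.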

That doesn't explain why there is a space at that point. My guess would be
that you are in fact appending a space to each of your characters() events
before you forward the node downstream. Perhaps you could post that
handler?

I am very curious. Could you post an example program? You could generate
about 750k of sample CDATA, well past the 65536-byte mark, by using the
'x' operator:

my $doc = '<doc><![CDATA[' . 'abc<def&hij>klm' x 50_000 . ']]></doc>';
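
A test program along those lines might look like the sketch below. It
assumes XML::Parser is installed from CPAN and uses its Char handler, which
is called with the Expat object and a string. The variable names are made
up for illustration. It counts how often the handler fires and checks that
plain concatenation of the chunks reproduces the original text, with no
extra whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;    # CPAN module, assumed to be installed

# Same sample document as above.
my $doc = '<doc><![CDATA[' . 'abc<def&hij>klm' x 50_000 . ']]></doc>';

my $events = 0;     # how many times the Char handler fired
my $text   = '';    # concatenation of all chunks, nothing added

my $parser = XML::Parser->new(
    Handlers => {
        Char => sub {
            my ($expat, $string) = @_;
            $events++;
            $text .= $string;
        },
    },
);
$parser->parse($doc);

my $expected = 'abc<def&hij>klm' x 50_000;
printf "Char events: %d\n", $events;
printf "Text intact: %s\n", ($text eq $expected ? 'yes' : 'no');
```

If the text comes back intact here, the stray space is most likely being
added by the handler code rather than by the parser.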

Todd W.


