[EMAIL PROTECTED] wrote:
> I think I ran into a bug in the XML SAX parser.
>
> Part of my program consists of reading a rather large XML file (about
> 10 MB) containing a few thousand elements.
> I have the following problem: sometimes the SAX parser misreads a
> line.
> Let me explain: the XML file contains a few thousand lines like this:
> "
> <TargetRef>WINOSSPI:Storage@@n91c90a.cmc.com</TargetRef>
> "
> where 'n91c90a.cmc.com' is the name of a system and thus changes per
> system.
> In a few cases, the SAX parser misreads the line. The parser sometimes
> splits the characters of the line into:
> "WINOSSPI:Storage@@n" and "91c90a.cmc.com".
> I put a 'print characters' line in the 'characters' method of the
> parser; that is how I found out.
> It only happens for a few of the thousand lines, but you can imagine
> that is very annoying.
>
> I checked for errors in the XML file, but the file seems OK.
>
> Is this a bug or am I doing something wrong?

It's not a bug; the parser is free to split up character runs (due to
buffering, entities, character references, etc.), so characters() may be
called more than once for a single piece of text. It's up to you to
merge character runs into strings.

</F>
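
For what it's worth, here is a minimal sketch of one way to merge the runs, assuming Python 3 and the standard xml.sax module. The element name TargetRef comes from the example above; the class name TargetRefHandler and the helper parse_targets are made up for illustration. The idea is to buffer every piece passed to characters() and only join the pieces when the element ends.

```python
import xml.sax


class TargetRefHandler(xml.sax.ContentHandler):
    """Hypothetical handler: collects the full text of each <TargetRef>."""

    def __init__(self):
        super().__init__()
        self.targets = []    # completed strings, one per <TargetRef>
        self._chunks = None  # None while we are outside a <TargetRef>

    def startElement(self, name, attrs):
        if name == "TargetRef":
            self._chunks = []  # start a fresh buffer for this element

    def characters(self, content):
        # May be called several times for one run of text, so only
        # buffer the piece here; do not treat it as the whole string.
        if self._chunks is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "TargetRef":
            # The element is complete: merge the buffered pieces.
            self.targets.append("".join(self._chunks))
            self._chunks = None


def parse_targets(xml_bytes):
    """Hypothetical helper: return the text of every <TargetRef>."""
    handler = TargetRefHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.targets
```

With this approach it no longer matters where the parser decides to split the text; for a 10 MB file on disk, the same handler could be passed to xml.sax.parse() along with the file name instead of using parseString().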