jia li, 28.07.2010 12:10:
I have an XML file with hundreds of <error> elements.
What's strange is that only one of these elements could not be parsed correctly:
<error>
<checker>REVERSE_INULL</checker>
<function>Dispose_ParameterList</function>
<unmangled_function>Dispose_ParameterList</unmangled_function>
<status>UNINSPECTED</status>
<num>146</num>
<home>1/146MMSLib_LinkedList.c</home>
</error>
I printed the data in "characters(self, data)" and after parsing. The result
is that one "\r\n" is inserted after "1/" and "146MMSLib_LinkedList.c" for the
latter.
But if I reduce my XML file to only this element, it parses correctly.
First of all: don't use SAX. Use ElementTree's iterparse() function. That
will shrink your code down to a simple loop in a few lines.
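
A minimal sketch of such a loop, assuming the <error> elements sit inside a
single root element. The <errors> wrapper, the sample data and the
read_errors() helper are made up for illustration and are not taken from the
original file:

```python
import io
import xml.etree.ElementTree as ET

# Sample data modelled on the <error> element from the question.
# The <errors> root element is an assumption.
SAMPLE = b"""<errors>
<error>
<checker>REVERSE_INULL</checker>
<function>Dispose_ParameterList</function>
<status>UNINSPECTED</status>
<num>146</num>
<home>1/146MMSLib_LinkedList.c</home>
</error>
</errors>"""


def read_errors(source):
    """Yield one dict per <error> element, mapping child tag to its text."""
    # An "end" event fires once the element, including all of its text,
    # has been read completely.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "error":
            yield {child.tag: child.text for child in elem}
            elem.clear()  # drop the processed children to save memory


errors = list(read_errors(io.BytesIO(SAMPLE)))
print(errors[0]["home"])
```

For a real file, pass the file name (or an open binary file) to read_errors()
instead of the BytesIO object.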
Then, the problem is likely that you are getting separate events for text
nodes: SAX is allowed to deliver the text of a single element in several
characters() calls. The "\r\n" most likely only occurs due to your print
statement; I doubt that it's really in the data returned from SAX. Again:
using ElementTree instead of SAX will avoid this kind of problem.
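
If the code has to stay on SAX for some reason, the usual way to cope with
split text events is to collect the chunks and join them when the element
ends. A rough sketch, where the HomeHandler class and its attribute names are
made up for illustration:

```python
import xml.sax


class HomeHandler(xml.sax.ContentHandler):
    """Collect the complete text of every <home> element."""

    def __init__(self):
        super().__init__()
        self.homes = []      # finished <home> texts
        self._chunks = None  # list of text chunks while inside <home>

    def startElement(self, name, attrs):
        if name == "home":
            self._chunks = []

    def characters(self, data):
        # May be called several times for one text node, so only collect
        # the pieces here instead of treating each call as a full value.
        if self._chunks is not None:
            self._chunks.append(data)

    def endElement(self, name):
        if name == "home":
            self.homes.append("".join(self._chunks))
            self._chunks = None


handler = HomeHandler()
xml.sax.parseString(
    b"<error><home>1/146MMSLib_LinkedList.c</home></error>", handler
)
print(handler.homes)
```

Printing inside characters() shows each chunk on its own line, which would
explain an apparent line break in the middle of the value.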
Stefan