Michele Orrù added the comment:

> The parser *is* rejecting control characters. It's an XML parser. See the
> example in the link you posted.

Ehrm, my apologies.
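
For reference, a minimal sketch of the behaviour under discussion, assuming the standard-library xml.etree.ElementTree module. The helper name round_trips is made up for illustration and is not part of any API. It checks whether an element's text survives serialisation followed by parsing; as far as I can tell, the serialiser writes a control character out unchanged and the parser then refuses it.

```python
import xml.etree.ElementTree as ET


def round_trips(text):
    """Return True if an element holding `text` survives tostring() -> fromstring().

    Hypothetical helper, for illustration only.
    """
    elem = ET.Element("a")
    elem.text = text
    try:
        data = ET.tostring(elem)      # serialise the in-memory tree to bytes
        parsed = ET.fromstring(data)  # parse the serialised bytes back
    except (ET.ParseError, ValueError):
        # Either the parser rejected the output or serialisation itself failed.
        return False
    return parsed.text == text


print(round_trips("plain text"))
print(round_trips("\x01"))
```

I would expect the first call to succeed and the second to fail, since U+0001 is not a legal character in XML 1.0.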

> That's not an XML specific issue. You are printing a byte string here, so
> repr() would be the right thing to use (and is actually being used
> automatically in Py3), instead of plain printing. The fact that you are
> wrapping the content in XML doesn't matter.

[citation needed]
After a quick scan of the documentation I did not see anything mentioning this.
Instead, I see many cases in which escape characters and binary-to-text
encodings are mentioned.

> What I meant was: at what step of the process from creating an XML tree in
> memory to serialisation is it a problem that the tree contains control
> characters?
> Because once the data is serialised, it will just be rejected on input by any
> XML parser, and handling bytes data is a thing on its own (e.g. you could
> serialise to UTF16 and the result would contain null bytes - too bad).

Hm, I think the problem lies in the expectation of having
fromstring(tostring(tree)) = tree

> Unless there is a more dangerous way to exploit this that is actually due to
> XML being used, I'd suggest changing the type from "security" back to
> "behaviour".
> Or maybe even to "enhancement". The behaviour that it writes out what you
> give it isn't exactly wrong, it's just inconvenient that you have to take
> care yourself that you pass it well-formed XML content.

I think the point here is clarifying whether XML expects text or just a byte
string. If it is a stream of bytes, I agree with you: it is more of a
"behaviour" problem.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18850>
_______________________________________