Michele Orrù added the comment:

> The parser *is* rejecting control characters. It's an XML parser. See the 
> example in the link you posted.
Ehrm, my apologies.

> That's not an XML specific issue. You are printing a byte string here, so 
> repr() would be the right thing to use (and is actually being used 
> automatically in Py3), instead of plain printing. The fact that you are 
> wrapping the content in XML doesn't matter.
[citation needed] 
After a quick scan of the documentation, I did not see anything mentioning this. 
Instead, I see many cases in which escape characters and binary-to-text 
encodings are mentioned.

> What I meant was: at what step of the process from creating an XML tree in 
> memory to serialisation is it a problem that the tree contains control 
> characters? 
> Because once the data is serialised, it will just be rejected on input by any 
> XML parser, and handling bytes data is a thing on its own (e.g. you could 
> serialise to UTF16 and the result would contain null bytes - too bad).
Hm, I think the problem lies in the expectation that 
fromstring(tostring(tree)) == tree
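
To make that expectation concrete, here is a minimal sketch, assuming Python 3 and xml.etree.ElementTree. The helper name round_trips is made up for this example, and it only compares the tag and text of a single element rather than a whole tree.

```python
import xml.etree.ElementTree as ET


def round_trips(text):
    """Return True if an element holding `text` survives tostring/fromstring.

    Hypothetical helper for illustration only; it checks just tag and text.
    """
    root = ET.Element("root")
    root.text = text
    try:
        data = ET.tostring(root)       # serialise the in-memory tree
        parsed = ET.fromstring(data)   # parse the serialised form again
    except (ET.ParseError, ValueError):
        # Either the parser rejected the output, or serialisation refused it.
        return False
    return parsed.tag == root.tag and parsed.text == root.text


plain_ok = round_trips("hello")        # ordinary text
control_ok = round_trips("\x1b[31m")   # text containing an ESC control character

print("plain text round-trips:", plain_ok)
print("control character round-trips:", control_ok)
```

The tree can be built in memory in both cases; the difference only shows up once the serialised form is parsed again.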

> Unless there is a more dangerous way to exploit this that is actually due to 
> XML being used, I'd suggest changing the type from "security" back to 
> "behaviour".
> Or maybe even to "enhancement". The behaviour that it writes out what you 
> give it isn't exactly wrong, it's just inconvenient that you have to take 
> care yourself that you pass it well-formed XML content.
I think the point here is clarifying whether xml expects text or just a byte 
string. If it is just a stream of bytes, I agree with you: it is more of a 
"behaviour" problem.
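
For reference, a small sketch of the text-versus-bytes distinction, assuming Python 3's xml.etree.ElementTree (the variable names are only illustrative): tostring() returns bytes unless encoding="unicode" is requested, and fromstring() accepts either form.

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")
root.text = "caf\u00e9"

# Without an explicit encoding, tostring() produces a byte string.
as_bytes = ET.tostring(root)
# With encoding="unicode", it produces text (str) instead.
as_text = ET.tostring(root, encoding="unicode")

print(type(as_bytes).__name__)
print(type(as_text).__name__)

# fromstring() accepts both forms and recovers the same text content.
from_bytes = ET.fromstring(as_bytes)
from_text = ET.fromstring(as_text)
print(from_bytes.text == from_text.text)
```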

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18850>
_______________________________________