Tomalak added the comment:
@devon: Thanks for pointing & linking back here.
--
___
Python tracker
<http://bugs.python.org/issue5752>
___
___
Python-bugs-li
Changes by Tomalak:
--
title: xml.dom.minidom does not escape newline characters within attribute
values -> xml.dom.minidom does not escape CR, LF and TAB characters within
attribute values
Tomalak added the comment:
I changed the patch to include support for TAB characters, which were
also left unencoded before.
Also I switched encoding from '&#10;' etc. to '&#xA;'. This is
equivalent, but the spec uses the latter variant.
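A quick way to check the equivalence claimed above: a minimal sketch, assuming the standard xml.dom.minidom parser. The element name "a" and the attribute name "b" are made up for illustration. A decimal and a hexadecimal character reference for LF should both come back as the same newline character.

```python
from xml.dom import minidom

# Decimal (&#10;) and hexadecimal (&#xA;) references to the same character, LF.
decimal = minidom.parseString('<a b="x&#10;y"/>').documentElement.getAttribute("b")
hexadecimal = minidom.parseString('<a b="x&#xA;y"/>').documentElement.getAttribute("b")

# Both forms decode to a literal newline inside the attribute value.
print(decimal == hexadecimal == "x\ny")
```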
Changes by Tomalak:
Added file: http://bugs.python.org/file13977/minidom.patch
Changes by Tomalak:
Removed file: http://bugs.python.org/file13919/minidom.patch
Tomalak added the comment:
Daniel Diniz:
The proposed behaviour is correct:
http://www.w3.org/TR/2000/WD-xml-c14n-2119.html#charescaping
"In attribute values, the character information items
TAB (#x9), newline (#xA), and carriage-return (#xD)
are represented by " &
Tomalak added the comment:
Francesco,
> if you want to encode the newline character,
> this should be done by both parseString and
> setAttribute methods. Otherwise, the
> behaviour is not symmetric.
I believe you still don't see the issue. The behaviour is not symmetric
*
Changes by Tomalak:
--
title: xml.dom.minidom does not handle newline characters in attribute values
-> xml.dom.minidom does not escape newline characters within attribute values
Tomalak added the comment:
Francesco, I think you are missing the point. :-) The problem has two sides.
If I create an XML document using the DOM (not by parsing it from a
string!), then I can put newline characters into attribute values. This
is allowed and conforms to the XML spec.
However
Changes by Tomalak:
Added file: http://bugs.python.org/file13921/minidom_test.py
Changes by Tomalak:
Removed file: http://bugs.python.org/file13920/toxml_test.py
Tomalak added the comment:
Attaching a test file that outlines the problem. Output on my system
(Windows / Python 3.0) is:
Without the patch:
C:\Python30>python.exe c:\minidom_test.py
False
1 -->"multiline
value"
2 -->"multiline value"
With the patch:
C:\Python30&
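For readers without the attachment, the round trip described in this comment can be approximated as follows. This is a sketch only, not the attached minidom_test.py: the element name "root" and the attribute name "attr" are assumptions, and whether the newline survives depends on whether the Python version in use escapes it in toxml().

```python
from xml.dom import minidom

# Build the document through the DOM, not by parsing a string.
doc = minidom.parseString("<root/>")
doc.documentElement.setAttribute("attr", "multiline\nvalue")

# Serialize, then read the result back in.
serialized = doc.toxml()
reparsed = minidom.parseString(serialized).documentElement.getAttribute("attr")

print(repr(serialized))
# True only if toxml() escaped the newline; False if it was written literally
# and the parser normalized it to a space.
print(reparsed == "multiline\nvalue")
```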
Tomalak added the comment:
Attaching a patch that fixes the problem.
--
keywords: +patch
Added file: http://bugs.python.org/file13919/minidom.patch
Tomalak added the comment:
Hmm... I thought toxml() is the part that needs to be fixed, not the
parsing/reading. I mentioned the reading only to outline the data loss
that occurs eventually.
My point is: The toxml() (i.e. _write_data) *actually writes* the
newline to the output. And within
Tomalak added the comment:
Of course it should be:
def _write_data(writer, data, is_attrib=False):
    "Writes datachars to writer."
    data = data.replace("&", "&amp;").replace("<", "&lt;")
    data = data.replace("\"&q
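Since the snippet above is cut off, here is a self-contained sketch of the idea it describes. It is not the attached minidom.patch: the function name write_data and the exact replacement chain are assumptions, written only to illustrate escaping TAB, LF and CR when is_attrib is true.

```python
import io


def write_data(writer, data, is_attrib=False):
    """Write character data to writer, escaping XML special characters."""
    # Escape "&" first so the entities added below are not escaped again.
    data = data.replace("&", "&amp;").replace("<", "&lt;")
    data = data.replace('"', "&quot;").replace(">", "&gt;")
    if is_attrib:
        # Assumption: whitespace becomes a character reference only inside
        # attribute values, so a parser cannot normalize it to a space.
        data = data.replace("\t", "&#x9;")
        data = data.replace("\n", "&#xA;")
        data = data.replace("\r", "&#xD;")
    writer.write(data)


buf = io.StringIO()
write_data(buf, "multiline\nvalue", is_attrib=True)
print(buf.getvalue())  # multiline&#xA;value
```

Text nodes are left alone here because whitespace in element content is not subject to attribute-value normalization.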
Tomalak added the comment:
@Francesco Sechi: Would it not just require a minimal change to the
_write_data() method? Something along the lines of (sorry, no Python
expert, maybe I am way off):
def _write_data(writer, data, is_attrib=False):
    "Writes datachars to writer."
    if
Changes by Tomalak:
--
type: -> behavior
___
Python tracker
<http://bugs.python.org/issue5752>
___
___
Python-bugs-list mailing list
Unsubscri
New submission from Tomalak:
Current behavior upon toxml() is:
Upon reading the document again, the new line is normalized and
collapsed into a space (according to the XML spec, section 3.3.3), which
means that it is lost.
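The normalization on reading can be reproduced in isolation. A minimal sketch, assuming the standard minidom parser; the element name "root" and the attribute name "attr" are made up for illustration:

```python
from xml.dom import minidom

# A literal newline inside an attribute value in the source text.
source = '<root attr="multiline\nvalue"/>'
parsed = minidom.parseString(source).documentElement.getAttribute("attr")

# The parser applies attribute-value normalization: the newline becomes a space.
print(repr(parsed))  # 'multiline value'
```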
Better behavior would be something like this (within attribute