[issue36407] xml.dom.minidom wrong indentation writing for CDATA section

Vladimir Surjaninov Sat, 23 Mar 2019 08:39:32 -0700

New submission from Vladimir Surjaninov <vsurjani...@gmail.com>:

When writing XML that contains a CDATA section with non-empty indentation and 
newline parameters, the section's parent node ends up containing stray 
indentation, which is parsed back as text.


Example:
>>> from xml.dom import minidom
>>> doc = minidom.Document()
>>> root = doc.createElement('root')
>>> doc.appendChild(root)
>>> node = doc.createElement('node')
>>> root.appendChild(node)
>>> data = doc.createCDATASection('</data>')
>>> node.appendChild(data)
>>> print(doc.toprettyxml(indent=' ' * 4))
<?xml version="1.0" ?>
<root>
    <node>
<![CDATA[</data>]]>    </node>
</root>

If we parse this output back, the CDATA value is not recovered correctly.

With xml_text holding the output above, the following returns a string that 
contains only indentation characters:
>>> doc = minidom.parseString(xml_text)
>>> doc.getElementsByTagName('node')[0].firstChild.nodeValue

And this returns a string with the CDATA value plus the indentation characters:
>>> doc.getElementsByTagName('node')[0].firstChild.wholeText
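To make the parsing problem easy to check, here is a minimal self-contained sketch. It hard-codes the serialized text from the example above (the variable name xml_text and the 4-space indentation are taken from that example), so the result does not depend on how the running Python version writes CDATA sections. The last lookup, which filters children by node type, is only an illustration of how to reach the CDATA value regardless of the surrounding whitespace.

```python
from xml.dom import minidom

# Serialized text as shown in the report, hard-coded on purpose.
xml_text = (
    '<?xml version="1.0" ?>\n'
    '<root>\n'
    '    <node>\n'
    '<![CDATA[</data>]]>    </node>\n'
    '</root>\n'
)

parsed = minidom.parseString(xml_text)
node = parsed.getElementsByTagName('node')[0]
first = node.firstChild

# The first child is a whitespace-only text node, not the CDATA section.
print(repr(first.nodeValue))

# wholeText joins adjacent text and CDATA siblings, so it contains the
# CDATA value together with the stray whitespace.
print(repr(first.wholeText))

# Illustration only: pick out CDATA children explicitly by node type.
cdata_values = [
    child.data
    for child in node.childNodes
    if child.nodeType == child.CDATA_SECTION_NODE
]
print(cdata_values)
```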


There is a workaround, which is to mark the CDATA node as a text node before writing:
>>> data.nodeType = data.TEXT_NODE
…
>>> print(doc.toprettyxml(indent=' ' * 4))
<?xml version="1.0" ?>
<root>
    <node><![CDATA[</data>]]></node>
</root>

This output is parsed correctly:
>>> doc.getElementsByTagName('node')[0].firstChild.nodeValue
'</data>'

Still, I think it would be better to fix the writing function so that this 
becomes the default behavior.

----------
components: XML
messages: 338681
nosy: vsurjaninov
priority: normal
severity: normal
status: open
title: xml.dom.minidom wrong indentation writing for CDATA section
type: enhancement

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36407>
_______________________________________