On 02/03/2022 18:39, Dieter Maurer wrote:
Robin Becker wrote at 2022-3-2 15:32 +0000:
I'm using lxml.etree.XMLParser and would like to distinguish
<tag></tag>
from
<tag/>
I seem to have e.getchildren()==[] and e.text==None for both cases. Is there a
way to get the first to have e.text==''?
I do not think so (at least not without a DTD):
I have a DTD which has
<!ELEMENT tag (content)*>
so I guess the empty case is allowed as well as the self closed.
I am converting from an older parser which has text=='' for <tag></tag> and text==None for the self-closed version. I
don't think I really need to make the distinction. However, I wonder whether lxml can deliberately present empty string
content, or whether that always has to be a semantic decision.
`<tag/>' is just a shorthand notation for '<tag></tag>' and
the difference has no influence on the DOM.
Note that `lxml` is just a Python binding for `libxml2`.
All the parsing is done by this library.
yes I think I knew that
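
For anyone wanting to check the behaviour described above, here is a minimal sketch. It assumes nothing beyond a Python install: it uses lxml if available and otherwise falls back to the stdlib ElementTree, which exposes the same .text attribute for this case. The helper text_or_empty is a hypothetical name, not part of lxml; it just normalises a missing text to '' for code that expects the older parser's behaviour.

```python
# Sketch: prefer lxml if installed, otherwise fall back to the stdlib
# ElementTree, which offers the same fromstring()/.text interface here.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

empty = etree.fromstring("<tag></tag>")
self_closed = etree.fromstring("<tag/>")

# Both spellings parse to an element with no children and text None,
# so the parsed tree alone cannot tell them apart.
assert len(empty) == 0 and len(self_closed) == 0
assert empty.text is None and self_closed.text is None


def text_or_empty(element):
    """Hypothetical helper: treat a missing text as an empty string."""
    return element.text if element.text is not None else ""
```

Calling text_or_empty(e) wherever the old code read e.text gives '' for both forms, which is enough if the distinction is not actually needed.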