On 02/03/2022 18:39, Dieter Maurer wrote:
Robin Becker wrote at 2022-3-2 15:32 +0000:
I'm using lxml.etree.XMLParser and would like to distinguish

<tag></tag>

from

<tag/>

I seem to have e.getchildren()==[] and e.text==None for both cases. Is there a 
way to get the first to have e.text==''?

I do not think so (at least not without a DTD):

I have a DTD which has

<!ELEMENT tag (content)*>

so I guess the empty case is allowed as well as the self closed.

I am converting from an older parser which has text=='' for <tag></tag> and text==None for the self-closed version. I don't think I really need to make the distinction. However, I wonder whether lxml can deliberately present empty string content, or whether that always has to be a semantic decision.

`<tag/>' is just a shorthand notation for '<tag></tag>' and
the difference has no influence on the DOM.
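
A minimal sketch to check this from Python, assuming lxml is installed (the fallback to the standard-library ElementTree is an assumption added here so the snippet runs without lxml; both expose the same element API for this purpose). The helper text_or_empty is hypothetical and only illustrates one way to get the older parser's '' back when the distinction does not matter:

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the stdlib parser when lxml is unavailable.
    import xml.etree.ElementTree as etree

# Parse both spellings of an empty element.
empty_pair = etree.fromstring("<tag></tag>")
self_closed = etree.fromstring("<tag/>")


def text_or_empty(element):
    """Hypothetical helper: treat a missing text node as an empty string."""
    return element.text if element.text is not None else ""


# Both elements report no text and no children, so the tree cannot
# tell which spelling was used in the source document.
for element in (empty_pair, self_closed):
    print(repr(element.text), len(element), repr(text_or_empty(element)))
```

Note that len(element) is used instead of getchildren(), which is deprecated in lxml and no longer present in the standard-library ElementTree.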

Note that `lxml` is just a Python binding for `libxml2`.
All the parsing is done by this library.
yes I think I knew that
--
https://mail.python.org/mailman/listinfo/python-list
