Hi,

Chas Emerick wrote:
> I looked around for an ElementTree-specific mailing list, but found none
> -- my apologies if this is too broad a forum for this question.

The lxml mailing list is always happy to receive feedback, but it's fine to ask
here if it's not lxml specific.

> I've been using the lxml variant of the ElementTree API.
> it shares the use of a .tail attribute. I
> ran headlong into this aspect of the API while doing some DOM
> manipulations, and it's got me pretty confused.
>
> Example:
>
> >>> from lxml import etree as ET
> >>> frag = ET.XML('<a>head<b>inside</b>tail</a>')
> >>> b = frag.xpath('//b')[0]
> >>> b
> <Element b at 71cbe8>
> >>> b.text
> 'inside'
> >>> b.tail
> 'tail'
> >>> frag.remove(b)
> >>> ET.tostring(frag)
> '<a>head</a>'
>
> As you can see, the .tail text is removed as part of the <b> element --
> but it IS NOT part of the <b> element.

Yes, it is. Just look at the API. It's an attribute of an Element, isn't it?
What other API do you know where removing an element from a data structure
leaves part of the element behind?

If you want to copy part of the removed element back into the tree, feel free
to do so.

> Performing the same operations with the Java DOM api
> (Sorry for the Java comparison, but that's where I first cut my teeth on
> XML, and that's where my expectations were formed.)
>
> That's a pretty significant mismatch in functionality.

IMHO, DOM has a pretty significant mismatch with Python.

> I ran this issue past a few people I know who've worked with and written
> about ElementTree, and their response to this apparent divergence
> between the ET DOM API and "standard" DOM APIs was roughly: "that's just
> the way it is".

It's just a matter of understanding (or getting used to) the API. You might
want to stop thinking in terms of '<' and '>' and rather embrace the API itself
as a way to work with the XML Infoset (rather than the XML DOM).

Stefan
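
For anyone who does want the tail text to stay in the tree, copying it back before the removal can be sketched roughly as below. This is only a sketch under a few assumptions: it uses the standard library's xml.etree.ElementTree (which shares the .tail attribute with lxml) so that it runs without lxml installed, and remove_keep_tail is a hypothetical helper name, not a function of either library.

```python
import xml.etree.ElementTree as ET


def remove_keep_tail(parent, child):
    """Remove child from parent, but leave child's tail text in the tree.

    Hypothetical helper for illustration; not part of ElementTree or lxml.
    """
    tail = child.tail or ""
    children = list(parent)
    index = children.index(child)
    if index == 0:
        # No preceding sibling: the tail joins the parent's leading text.
        parent.text = (parent.text or "") + tail
    else:
        # Otherwise the tail joins the tail of the preceding sibling.
        prev = children[index - 1]
        prev.tail = (prev.tail or "") + tail
    # Detach the tail from the element itself before removing it.
    child.tail = None
    parent.remove(child)


frag = ET.XML('<a>head<b>inside</b>tail</a>')
b = frag.find('b')
remove_keep_tail(frag, b)
result = ET.tostring(frag, encoding='unicode')
print(result)  # <a>headtail</a>
```

The helper takes the parent explicitly because the standard library's elements do not keep a reference to their parent. The same logic should carry over to lxml elements, which offer the same .text, .tail and remove() interface, though that variant is not shown here.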